Python's regex engine performs pathologically on regexes like
'<!--.*?-->' when fed a large string of repeating '<!--' clauses.
Thanks to JaimeSlome / security@huntr.dev for the report; closes#1380.
Solved by switching to the Rust implementation, which does not suffer
from this issue.
entsToText(), minimizeHTML(), and the old regex constants have been
removed; they do not appear to be used by any add-ons.
The 'avoid showing learning card twice' logic is now only applied
when the next learning card was already due to be shown. This'll mean
there will be cases where a learning card does get shown twice near
the end, but it makes the behaviour easier to reason about, for both
us and end users.
Matches should arrive in alphabetical order. Currently results are not
capped (JS should be able to handle ~1k tags without too much hassle),
and no reordering based on match location is done. Matches are substring
based, and multiple can be provided, eg "foo::bar" will match
"foof::baz::abbar".
This is not hooked up properly on the frontend at the moment -
updateSuggestions() seems to be missing the most recently typed character,
and is not updating the list of completions half the time.
There were a few issues going on here:
- If some operation had invalidated the queues, they were subsequently
recreated with a call to .get_queues() in the undo handling code. This
could happen after the changes to the card had already been reverted,
leading to a queue state that didn't match our expectations.
- More generally, it's not safe to assume our mutations will apply
cleanly after the queue has been rebuilt. The next card will vary
depending on the number of remaining cards when interspersing cards of
different types, and a queue-invalidating operation will have changed
the learning cutoff.
So rather than rebuilding the queues on demand, we now check that they
already exist, and were created at the time we expect. If not, we
invalidate them and skip applying the mutations, and a subsequent
refresh of the UI should rebuild the queues correctly.
As part of this change, the cutoff snapshot has been moved into the
normal answer update object.
One possible downside here is that adding a note during review may cause
a newly due learning card to appear when undoing a different review.
If this proves to be a problem, we could potentially note down the
learning cutoff and apply it when queues are rebuilt later.
Context: https://forums.ankiweb.net/t/more-cards-today-question-about-v3/12400/10
Previously, interday learning cards and reviews were gathered at the
same time in v3, with the review limit being applied to both of them. The
order cards were gathered in would change the ratio of gathered learning
cards and reviews, but as they were displayed together in a single count,
a changing ratio was not apparent, and no special handling was required
by the deck tree code.
Showing interday learning cards in the learning count, while still
applying a review limit to them, makes things more complicated, as
a changing ratio will result in different counts. The deck tree code
is not able to know which order cards will appear in, so without changes,
we would have had a situation where the deck list may show different counts
to those seen when clicking on a deck.
One way to solve this would have been to introduce a separate limit for
interday learning cards. But this would have meant users needed to
juggle two different limits, instead of having a single one that controls
total number of (non-intraday) cards shown.
Instead, the scheduler now fetches interday cards prior to reviews -
the rationale for that order is that learning cards tend to be more
fragile/urgent than reviews. The option to show learning cards
before/after/mixed with reviews still exists, but it applies only after
cards have been capped to the daily limit.
To ensure the deck tree code matches the counts the scheduler gives,
it too applies limits to interday learning cards first, and reviews
afterwards.
In the old HTML editor, filenames were % escaped before feeding them to
beautifulsoup, causing bare ampersands to be left alone. The new HTML
editor reads content from the DOM, where a bare ampersand has been
transformed into an &, and that gets saved back into the field,
so the media check now needs to deal with it for images as well.
https://forums.ankiweb.net/t/causing-problems-with-image-names/12171
Interday learning cards are now counted in the learning count again,
and are no longer subject to the daily review limit.
The thinking behind the original change was that interday learning cards
are scheduled more like reviews, and counting them in the review count
would allow the learning count to focus on intraday learning - the red
number reflecting the fact that they are the most fragile memories. And
counting them together made it practical to apply the review limit
to both at once.
Since the release, there have been a number of users expecting to see
interday learning cards included in the learning count (the latest being
https://forums.ankiweb.net/t/feedback-and-a-feature-adjustment-request-for-2-1-45/12308),
and a good argument can be made for that too - they are, after all, listed
in the learning steps, and do tend to be harder than reviews. Short of
introducing another count to keep track of interday and intraday learning
separately, moving back to the old behaviour seems like the best move.
This also means it is not really practical to apply the review limit to
interday learning cards anymore, as the limit would be split between two
different numbers, and how much each number is capped would depend on
the order cards are introduced. The scheduler could figure this out, but
the deck list code does not know card order, and would need significant
changes to be able to produce numbers that matched the scheduler. And
even if we ignore implementation complexities, I think it would be more
difficult for users to reason about - the influence of the review limit
on new cards is confusing enough as it is.
The v3 scheduler will delay the final card from being shown twice in
a row, but the overdue case was being treated the same as the no-learning
case, leading to the message being hidden.
Previously we would just use 250% ease for any new card that had no
pre-configured ease, but this will result in decks that have
non-standard ease values to have "set due date" cards in them that don't
match. In order to make this somewhat more efficient, we cache
deckid->ease lookups during this operation.
Ref: <https://forums.ankiweb.net/t/set-due-date-doesnt-obey-default-ease-factor/9184>
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
Unfortunately a popular note taking tool has been misusing cloze
markers in its deck exports. We may want to add this back in the
future, but we'll probably want to start by warning users, to give
people time to adjust.
In order to split backend.proto into a more manageable size, the protobuf
handling needed to be updated. This took more time than I would have
liked, as each language handles protobuf differently:
- The Python Protobuf code ignores "package" directives, and relies
solely on how the files are laid out on disk. While it would have been
nice to keep the generated files in a private subpackage, Protobuf gets
confused if the files are located in a location that does not match
their original .proto layout, so the old approach of storing them in
_backend/ will not work. They now clutter up pylib/anki instead. I'm
rather annoyed by that, but alternatives seem to be having to add an extra
level to the Protobuf path, making the other languages suffer, or trying
to hack around the issue by munging sys.modules.
- Protobufjs fails to expose packages if they don't start with a capital
letter, despite the fact that lowercase packages are the norm in most
languages :-( This required a patch to fix.
- Rust was the easiest, as Prost is relatively straightforward compared
to Google's tools.
The Protobuf files are now stored in /proto/anki, with a separate package
for each file. I've split backend.proto into a few files as a test, but
the majority of that work is still to come.
The Python Protobuf building is a bit of a hack at the moment, hard-coding
"proto" as the top level folder, but it seems to get the job done for now.
Also changed the workspace name, as there seems to be a number of Bazel
repos moving away from the more awkward reverse DNS naming style.
Instead of calling a method inside the transaction body, routines
can now pass Op::SkipUndo if they wish the changes to be discarded
at the end of the transaction. The advantage of doing it this way is
that the list of changes can still be returned, allowing the sync
indicator to update immediately.
Closes#1252
- changes can now be undone
- the same field can now be mapped to multiple target fields, allowing
fields to be cloned
- the old Qt dialog has been removed
- the old col.models.change() API calls the new code, to avoid
breaking existing consumers. It requires the field map to always
be passed in, but that appears to have been the common case.
- closes#1175
Multiple configs with the same inner id would lead to errors like the
following when trying to open the collection:
DeckConfigInner.interval_multiplier: invalid wire type: StartGroup (expected ThirtyTwoBit)
This makes the review backlog case more expensive, since we end up
shuffling items outside the daily limit, but for the common case it's
about the same speed, and it means we don't need two separate sorting
steps. New cards remain handled the same way, since a backlog
is common there.
Also ensures that interday learning cards honor the deck sorting, and
that the non-default sort orders shuffle at the end.
Like the previous change, avoid exposing the protobuf as a public API
for now. It requires more thought, and is probably better done with
either extra helper accessors like decks.name(), or via a native class.
Combine existing check for unparsable templates with a check for unknown
field names and a check for front sides without any field replacement.
Updating the notetype's fields now mutates the parsed templates, so the
checks can run on the final templates.
This could potentially help us avoid having to refetch the notetype
during study in the future, though updates to Note initialization and
the LaTeX handling would be required first.
- The "unbury deck" option was broken, as it was ignoring child
decks. It would be nice if we could use active_decks instead, but
plugging that into the old scheduler without breaking undo seems a bit
tricky.
- Remove the implicit From impl for decks, so we need to be forced to
think about whether we want child decks or not.
- Daily limits are no longer inherited - each deck limits its own
cards, and the selected deck enforces a maximum limit.
- Fetching of review cards now uses a single query, and sorts in advance.
In collections with a large number of overdue cards and decks, this is
faster than iterating over each deck in turn.
- Include interday learning count in review count & review limit, and
allow them to be buried.
- Warn when parent review limit is lower than child deck in deck options.
- Cap the new card limit to the review limit.
- Add option to control whether new card fetching short-circuits.
Instead of using a separate undo queue, the code now defers checking for
newly-due learning cards until the answering stage, and logs the updated
cutoff time as an undoable change, so that any newly-due learning cards
won't appear instead of a new/review card that was just undone.
Queue redo now uses a similar approach to undo, instead of rebuilding the
queues.
The original rationale was avoiding a possible O(n) insertion if
the learning card was due outside the cutoff, but the increased code
complexity doesn't seem worth it, given that learning cards will
rarely grow above 1000.
Also added a currently-disabled test that demonstrates the current undo
handling behaviour is yielding incorrect counts; that will be reworked
in the next commit, and this change will make that easier.
- split new card fetch order and subsequent sort order; use latter
when building queues
- default to spacing siblings when burying is off, with options to
show each sibling in turn, and shuffle the fetched cards
The bury new/review flags are now pulled from each card's home deck,
instead of using a global setting that had not been hooked up. This
unfortunately means we need to fetch the map of all decks up front, as
we need to be able to look up a deck configuration for cards that are
in filtered decks.
Fixes a "card was modified" error caused by cards being buried during
review, when they weren't removed up-front.
Avoids duplicate work, and is a step towards allowing the next
states to be modified by third-party code.
Also:
- fixed incorrect underlined count, due to reviews being labeled as
learning cards
- fixed reviewer not refreshing when undoing a test review, by splitting
up backend queue rebuilding from frontend reviewer refresh
- moved answering into a CollectionOp
Allows add-on authors to define their own label for a group of undoable
operations. For example:
def mark_and_bury(
*,
parent: QWidget,
card_id: CardId,
) -> CollectionOp[OpChanges]:
def op(col: Collection) -> OpChanges:
target = col.add_custom_undo_entry("Mark and Bury")
col.sched.bury_cards([card_id])
card = col.get_card(card_id)
col.tags.bulk_add(note_ids=[card.nid], tags="marked")
return col.merge_undo_entries(target)
return CollectionOp(parent, op)
The .add_custom_undo_entry() is for adding your own custom actions.
When extending a standard Anki action, instead store `target =
col.undo_status().last_step` after executing the standard operation.
This started out as a bigger refactor that required a separate
.commit_undoable() call to be run after each operation, instead of
having each operation return changes directly. But that proved to be
somewhat cumbersome in unit tests, and ran the risk of unexpected
behaviour if the caller invoked an operation without remembering to
finalize it.
- backend now updates current notetype as part of addition
- frontend no longer implicitly adds, so we can assign a new name and
add in a single operation
- tokio 1.0
- updated reqwest, thanks to Rumo
- other minor dep updates
the reqwest build file has been split into two, as it was awkward
to manually update the combined file, and the platform gate is now
on the target in rslib/
The deck name must be constructed by calling associated functions of
NativeDeckName, unless the name is guaranteed to be valid machine
name (like "Default").
NativeDeckName exposes methods to mutate the deck name and return
the human name.
The storage routines take &strs, but those should be slices of
NativeDeckNames to ensure machine form and normalization.
Instead, fetch the config order on the frontend and pass a builtin
variant into the backend.
That makes the following unnecessary:
* Resolving the config sort in search/mod.rs
* Deserializing the Column enum
* Config accessors for the sort columns
* Remove duplicate backend columns
* Remove duplicate column routines
* Move columns on frontend from state to model
* Generate available columns from Colum enum
* Add second column label for notes mode
- make sure we set flag in changes when config var changed
- move current deck get/set into backend
- set_config() now returns a bool indicating whether a change was
made, so other operations can be gated off it
- active decks generation is deferred until sched.reset()
remove_note() now returns the count of removed cards, allowing us
to unify the tooltip between browser and review screen
I've left the old translation in - we'll need to write a script at
one point that gathers all references to translations in the code,
and shows ones that are unused.
- pass the handler directly
- reviewer special-cases for flags and notes are now applied at
call site
- drop the kind attribute on OpChanges which is not needed
Like notetypes, there is a col.get_deck() routine which caches
fetches, so that successive fetches are cheap. This makes it simpler
to just fetch the deck at the start.
We were also attempting to fetch a deck with id 0 for each row; I've
changed this so that we only fetch it if the id is non-zero.
I18n uses an Arc internally, so it is cheap to clone. This allow us
to drop the lifetime specifiers on the context structures.
The backend knows exactly which op has executed, and it saves us having
to re-implement this logic on each client.
Fixes the browser table refreshing when toggling decks.
Updating a deck via protobuf is now exposed on the backend, but not
currently on the frontend - I suspect we'll be better off writing
separate routines for the actions we need instead, and we get a better
undo description for free.
This is currently causing an ugly redraw in the browse screen, which
will need fixing.
- use strum to generate an iterator for the protobuf enum so we don't
forget to add new labels if extending in the future
- no add-ons appear to be using dynOrderLabels(), so it has been removed
@RumovZ perhaps a similar approach might work for listing the available
browser columns as well?
Older translations will note have the $notetype variable, but that is
not an error in Fluent - it would only cause problems if we tried to
use the new string on older Anki versions.
So, this is fun. Apparently "DeckId" is considered preferable to the
"DeckID" were were using until now, and the latest clippy will start
warning about it. We could of course disable the warning, but probably
better to bite the bullet and switch to the naming that's generally
considered best.
It appears that including the build script as a dependency is not
enough to make the file available at runtime, so the build breaks
if ts/ is built first.
Instead of generating a fluent.proto file with a giant enum, create
a .json file representing the translations that downstream consumers
can use for code generation.
This enables the generation of a separate method for each translation,
with a docstring that shows the actual text, and any required arguments
listed in the function signature.
The codebase is still using the old enum for now; updating it will need
to come in future commits, and the old enum will need to be kept
around, as add-ons are referencing it.
Other changes:
- move translation code into a separate crate
- store the translations on a per-file/module basis, which will allow
us to avoid sending 1000+ strings on each JS page load in the future
- drop the undocumented support for external .ftl files, that we weren't
using
- duplicate strings in translation files are now checked for at build
time
- fix i18n test failing when run outside Bazel
- drop slog dependency in i18n module
- Filtered deck creation now happens as an atomic operation, and is
undoable.
- The logic for initial search text, normalizing searches and so on
has been pushed into the backend.
- Use protobuf to pass the filtered deck to the updated dialog, so
we don't need to deal with untyped JSON.
- Change the "revise your search?" prompt to be a simple info box -
user has access to cancel and build buttons, and doesn't need a separate
prompt. Tweak the wording so the 'show excluded' button should be more
obvious.
- Filtered decks have a time appended to them instead of a number,
primarily because it's easier to implement. No objections going back to
the old behaviour if someone wants to contribute a clean patch.
The standard de-duplication will happen if two decks are created in the
same minute with the same name.
- Tweak the default sort order, and start with two searches. The UI
will still hide the second search by default, but by starting with two,
the frontend doesn't need logic for creating the starting text.
- Search errors now have their own error type, instead of using
InvalidInput, as that was intended mainly for bad API calls. The markdown
conversion is done when the error is converted from the backend, allowing
errors to printed as a string without any special handling by the calling
code.
TODO: when building a new filtered deck, update_active() is clobbering
the undo log when the overview is refreshed