* Collection needs to be closed prior to backup even when not downgrading
* Backups -> BackupLimits
* Some improvements to backup_task
- backup_inner now returns the error instead of logging it, so that
the frontend can discover the issue when they await a backup (or create
another one)
- start_backup() was acquiring backup_task twice, and if another thread
started a backup between the two locks, the task could have been accidentally
overwritten without awaiting it
* Backups no longer require a collection close
- Instead of closing the collection, we ensure there is no active
transaction, and flush the WAL to disk. This means the undo history
is no longer lost on backup, which will be particularly useful if we
add a periodic backup in the future.
- Because a close is no longer required, backups are now achieved with
a separate command, instead of being included in CloseCollection().
- Full sync no longer requires an extra close+reopen step, and we now
wait for the backup to complete before proceeding.
- Create a backup before 'check db'
* Add File>Create Backup
https://forums.ankiweb.net/t/anki-mac-os-no-backup-on-sync/6157
* Defer checkpoint until we know we need it
When running periodic backups on a timer, we don't want to be fsync()ing
unnecessarily.
* Skip backup if modification time has not changed
We don't want the user leaving Anki open overnight, and coming back
to lots of identical backups.
* Periodic backups
Creates an automatic backup every 30 minutes if the collection has been
modified.
If there's a legacy checkpoint active, tries again 5 minutes later.
* Switch to a user-configurable backup duration
CreateBackup() now uses a simple force argument to determine whether
the user's limits should be respected or not, and only potentially
destructive ops (full download, check DB) override the user's configured
limit.
I considered having a separate limit for collection close and automatic
backups (eg keeping the previous 5 minute limit for collection close),
but that had two downsides:
- When the user closes their collection at the end of the day, they'd
get a recent backup. When they open the collection the next day, it
would get backed up again within 5 minutes, even though not much had
changed.
- Multiple limits are harder to communicate to users in the UI
Some remaining decisions I wasn't 100% sure about:
- If force is true but the collection has not been modified, the backup
will be skipped. If the user manually deleted their backups without
closing Anki, they wouldn't get a new one if the mtime hadn't changed.
- Force takes preference over the configured backup interval - should
we be ignored the user here, or take no backups at all?
Did a sneaky edit of the existing ftl string, as it hasn't been live
long.
* Move maybe_backup() into Collection
* Use a single method for manual and periodic backups
When manually creating a backup via the File menu, we no longer make
the user wait until the backup completes. As we continue waiting for
the backup in the background, if any errors occur, the user will get
notified about it fairly quickly.
* Show message to user if backup was skipped due to no changes
+ Don't incorrectly assert a backup will be created on force
* Add "automatic" to description
* Ensure we backup prior to importing colpkg if collection open
The backup doesn't happen when invoked from 'open backup' in the profile
screen, which matches Anki's previous behaviour. The user could
potentially clobber up to 30 minutes of their work if they exited to
the profile screen and restored a backup, but the alternative is we
create backups every time a backup is restored, which may happen a number
of times if the user is trying various ones. Or we could go back to a
separate throttle amount for this case, at the cost of more complexity.
* Remove the 0 special case on backup interval; minimum of 5 minutes
https://github.com/ankitects/anki/pull/1728#discussion_r830876833
* Write media files in chunks
* Test media file writing
* Add iter `ReadDirFiles`
* Remove ImportMediaError, fail fatally instead
Partially reverts commit f8ed4d89ba.
* Compare hashes of media files to be restored
* Improve `MediaCopier::copy()`
* Restore media files atomically with tempfile
* Make downgrade flag an enum
* Remove SchemaVersion::Latest in favour of Option
* Remove sha1 comparison again
* Remove unnecessary repr(u8) (dae)
* Fix legacy colpkg import; disable v3 import/export; add roundtrip test
The test has revealed we weren't decompressing the media files on v3
import. That's easy to fix, but means all files need decompressing
even when they already exist, which is not ideal - it would be better
to store size/checksum in the metadata instead.
* Switch media and meta to protobuf; re-enable v3 import/export
- Fixed media not being decompressed on import
- The uncompressed size and checksum is now included for each media
entry, so that we can quickly check if a given file needs to be extracted.
We're still just doing a naive size comparison on colpkg import at the
moment, but we may want to use a checksum in the future, and will need
a checksum for apkg imports.
- Checksums can't be efficiently encoded in JSON, so the media list
has been switched to protobuf to reduce the the space requirements.
- The meta file has been switched to protobuf as well, for consistency.
This will mean any colpkg files exported with beta7 will be
unreadable.
* Avoid integer version comparisons
* Re-enable v3 test
* Apply suggestions from code review
Co-authored-by: RumovZ <gp5glkw78@relay.firefox.com>
* Add export_colpkg() method to Collection
More discoverable, and easier to call from unit tests
* Split import/export code out into separate folders
Currently colpkg/*.rs contain some routines that will be useful for
apkg import/export as well; in the future we can refactor them into a
separate file in the parent module.
* Return a proper error when media import fails
This tripped me up when writing the earlier unit test - I had called
the equivalent of import_colpkg()?, and it was returning a string error
that I didn't notice. In practice this should result in the same text
being shown in the UI, but just skips the tooltip.
* Automatically create media folder on import
* Move roundtrip test into separate file; check collection too
* Remove zstd version suffix
Prevents a warning shown each time Rust Analyzer is used to check the
code.
Co-authored-by: RumovZ <gp5glkw78@relay.firefox.com>
* Implement colpkg exporting on backend
* Use exporting logic in backup.rs
* Refactor exporting.rs
* Add backend function to export collection
* Refactor backend/collection.rs
* Use backend for colpkg exporting
* Don't use default zip compression for media
* Add exporting progress
* Refactor media file writing
* Write dummy collections
* Localize dummy collection note
* Minimize dummy db size
* Use `NamedTempFile::new()` instead of `new_in`
* Drop redundant v2 dummy collection
* COLLECTION_VERSION -> PACKAGE_VERSION
* Split `lock_collection()` into two to drop flag
* Expose new colpkg in GUI
* Improve dummy collection message
* Please type checker
* importing-colpkg-too-new -> exporting-...
* Compress the media map in the v3 package (dae)
On collections with lots of media, it can grow into megabytes.
Also return an error in extract_media_file_names(), instead of masking
it as an optional.
* Store media map as a vector in the v3 package (dae)
This compresses better (eg 280kb original, 100kb hashmap, 42kb vec)
In the colpkg import case we don't need random access. When importing
an apkg, we will need to be able to fetch file data for a given media
filename, but the existing map doesn't help us there, as we need
filename->index, not index->filename.
* Ensure folders in the media dir don't break the file mapping (dae)
The _pb2 files are built for both the host and target architectures
(which seems superfluous - we may be able to fix that in the future).
Our script wrote the files into the build folder and then moved them
into the correct place, but because builds are not sandboxed on Windows,
the two actions were racy, and could cause each other to fail. Solved
by writing the files directly into their target locations.
Ideally this would have been in beta 6 :-) No add-ons appear to be
using customstudy.py/taglimit.py though, so it should hopefully not be
disruptive.
In the earlier custom study changes, we didn't get around to addressing
issue #1136. Now instead of trying to determine the maximum increase
to allow (which doesn't work correctly with nested decks), we just
present the total available to the user again, and let them decide. There's
plenty of room for improvement here still, but further work here might
be better done once we look into decoupling deck limits from deck presets.
Tags and available cards are fetched prior to showing the dialog now,
and will show a progress dialog if things take a while.
Tags are stored in an aux var now, so they don't inflate the deck
object size.
* Add forget prompt with options
- Restore original position
- Reset reps and lapses
* Restore position when resetting for export
* Add config context to avoid passing keys
* Add routine to fetch defaults; use method-specific enum (dae)
* Keep original position by default (dae)
* Fix code completion for forget dialog (dae)
Needs to be a symbolic link to the generated file
* Add zstd dep
* Implement backend backup with zstd
* Implement backup thinning
* Write backup meta
* Use new file ending anki21b
* Asynchronously backup on collection close in Rust
* Revert "Add zstd dep"
This reverts commit 3fcb2141d2be15f907269d13275c41971431385c.
* Add zstd again
* Take backup col path from col struct
* Fix formatting
* Implement backup restoring on backend
* Normalize restored media file names
* Refactor `extract_legacy_data()`
A bit cumbersome due to borrowing rules.
* Refactor
* Make thinning calendar-based and gradual
* Consider last kept backups of previous stages
* Import full apkgs and colpkgs with backend
* Expose new backup settings
* Test `BackupThinner` and make it deterministic
* Mark backup_path when closing optional
* Delete leaky timer
* Add progress updates for restoring media
* Write restored collection to tempfile first
* Do collection compression in the background thread
This has us currently storing an uncompressed and compressed copy of
the collection in memory (not ideal), but means the collection can be
closed without waiting for compression to complete. On a large collection,
this takes a close and reopen from about 0.55s to about 0.07s. The old
backup code for comparison: about 0.35s for compression off, about
8.5s for zip compression.
* Use multithreading in zstd compression
On my system, this reduces the compression time of a large collection
from about 0.55s to 0.08s.
* Stream compressed collection data into zip file
* Tweak backup explanation
+ Fix incorrect tab order for ignore accents option
* Decouple restoring backup and full import
In the first case, no profile is opened, unless the new collection
succeeds to load.
In the second case, either the old collection is reloaded or the new one
is loaded.
* Fix number gap in Progress message
* Don't revert backup when media fails but report it
* Tweak error flow
* Remove native BackupLimits enum
* Fix type annotation
* Add thinning test for whole year
* Satisfy linter
* Await async backup to finish
* Move restart disclaimer out of backup tab
Should be visible regardless of the current tab.
* Write restored collection in chunks
* Refactor
* Write media in chunks and refactor
* Log error if removing file fails
* join_backup_task -> await_backup_completion
* Refactor backup.rs
* Refactor backup meta and collection extraction
* Fix wrong error being returned
* Call sync_all() on new collection
* Add ImportError
* Store logger in Backend, instead of creating one on demand
init_backend() accepts a Logger rather than a log file, to allow other
callers to customize the logger if they wish.
In the future we may want to explore using the tracing crate as an
alternative; it's a bit more ergonomic, as a logger doesn't need to be
passed around, and it plays more nicely with async code.
* Sync file contents prior to rename; sync folder after rename.
* Limit backup creation to once per 30 min
* Use zstd::stream::copy_decode
* Make importing abortable
* Don't revert if backup media is aborted
* Set throttle implicitly
* Change force flag to minimum_backup_interval
* Don't attempt to open folders on Windows
* Join last backup thread before starting new one
Also refactor.
* Disable auto sync and backup when restoring again
* Force backup on full download
* Include the reason why a media file import failed, and the file path
- Introduce a FileIoError that contains a string representation of
the underlying I/O error, and an associated path. There are a few
places in the code where we're currently manually including the filename
in a custom error message, and this is a step towards a more consistent
approach (but we may be better served with a more general approach in
the future similar to Anyhow's .context())
- Move the error message into importing.ftl, as it's a bit neater
when error messages live in the same file as the rest of the messages
associated with some functionality.
* Fix importing of media files
* Minor wording tweaks
* Save an allocation
I18n strings with replacements are already strings, so we can skip the
extra allocation. Not that it matters here at all.
* Terminate import if file missing from archive
If a third-party tool is creating invalid archives, the user should know
about it. This should be rare, so I did not attempt to make it
translatable.
* Skip multithreaded compression on small collections
Co-authored-by: Damien Elmes <gpg@ankiweb.net>
Tokio has had to be pinned, because the 1.17 release introduces
a dependency on windows_sys, which fails to build on Windows on
Bazel.
The issue appears to be the build script of a subcrate - it is using
CARGO_MANIFEST_DIR to update the linking path so windows.lib can be
found (it's contained in that crate), but the path is set incorrectly.
dfc25285a2/crates/targets/x86_64_msvc/build.rs
One way we might be able to work around it is to add to the link path
in our own build script.
* Replace Card.data with .original_position
* Use and update original position in v3
* Show original position in card info
* Revert restoring original position for now
* Fix pb card to/from pylib card
* Try original_position as the last pb field
* minor wording tweaks (dae)
* Fix ValueError when exported files have wrong mtime
Set the `strict_timestamps` argument to `False`, so the media files which have a wrong mtime can be normally added to the zipfile.
* Update CONTRIBUTORS (ankitects#1666)
* Reformat exporting.py
IIRC, it was triggered when the user doesn't have a valid current
deck selected. currentRevLimit() sets default=False, but it is ignored
and the default deck is returned instead.
* Use submodule imports in aqt
* Use submodule imports in pylib
* More submodule imports in pylib
These required removing some direct imports to get rid of import cycles.
On a Linux machine here, the tests consistently fail when two copies
of black are run at once:
% bazel test //qt:format_check //pylib:format_check --cache_test_results=no
==================== Test output for //qt:format_check:
Process SyncManager-1:
Traceback (most recent call last):
File "/home/dae/.cache/bazel/_bazel_dae/fc22e40cbbf8b7d16ac57a00991b1ef1/external/python/lib/python3.9/multiprocessing/process.py", line 315, in _bootstrap
self.run()
File "/home/dae/.cache/bazel/_bazel_dae/fc22e40cbbf8b7d16ac57a00991b1ef1/external/python/lib/python3.9/multiprocessing/process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "/home/dae/.cache/bazel/_bazel_dae/fc22e40cbbf8b7d16ac57a00991b1ef1/external/python/lib/python3.9/multiprocessing/managers.py", line 583, in _run_server
server = cls._Server(registry, address, authkey, serializer)
File "/home/dae/.cache/bazel/_bazel_dae/fc22e40cbbf8b7d16ac57a00991b1ef1/external/python/lib/python3.9/multiprocessing/managers.py", line 156, in __init__
self.listener = Listener(address=address, backlog=16)
File "/home/dae/.cache/bazel/_bazel_dae/fc22e40cbbf8b7d16ac57a00991b1ef1/external/python/lib/python3.9/multiprocessing/connection.py", line 453, in __init__
self._listener = SocketListener(address, family, backlog)
File "/home/dae/.cache/bazel/_bazel_dae/fc22e40cbbf8b7d16ac57a00991b1ef1/external/python/lib/python3.9/multiprocessing/connection.py", line 596, in __init__
self._socket.bind(address)
OSError: [Errno 98] Address already in use
I dug briefly into Black's code, but suspect this is actually an issue
with the multiprocessing library. Didn't have time to investigate it
further; this workaround will do for now.
(One day I'll get around to merging those separate scripts into a single
one. One day. :-))
* avoid repinning Rust deps by default
* add id_tree dependency
* Respect intermediate child limits in v3
* Test new behaviour of v3 counts
* Rework v3 queue building to respect parent limits
* Add missing did field to SQL query
* Fix `LimitTreeMap::is_exhausted()`
* Rework tree building logic
https://github.com/ankitects/anki/pull/1638#discussion_r798328734
* Add timer for build_queues()
* `is_exhausted()` -> `limit_reached()`
* Move context and limits into `QueueBuilder`
This allows for moving more logic into QueueBuilder, so less passing
around of arguments. Unfortunately, some tests will require additional
work to set up.
* Fix stop condition in new_cards_by_position
* Fix order gather order of new cards by deck
* Add scheduler/queue/builder/burying.rs
* Fix bad tree due to unsorted child decks
* Fix comment
* Fix `cap_new_to_review_rec()`
* Add test for new card gathering
* Always sort `child_decks()`
* Fix deck removal in `cap_new_to_review_rec()`
* Fix sibling ordering in new card gathering
* Remove limits for deck total count with children
* Add random gather order
* Remove bad sibling order handling
All routines ensure ascending order now.
Also do some other minor refactoring.
* Remove queue truncating
All routines stop now as soon as the root limit is reached.
* Move deck fetching into `QueueBuilder::new()`
* Rework new card gather and sort options
https://github.com/ankitects/anki/pull/1638#issuecomment-1032173013
* Disable new sort order choices ...
depending on set gather order.
* Use enum instead of numbers
* Ensure valid sort order setting
* Update new gather and sort order tooltips
* Warn about random insertion order with v3
* Revert "Add timer for build_queues()"
This reverts commit c9f5fc6ebe87953c17a0c842990b009b5596c69c.
* Update rslib/src/storage/card/mod.rs (dae)
* minor wording tweaks to the tooltips (dae)
+ move legacy strings to bottom
+ consistent capitalization (our leech action still needs fixing,
but that will require introducing a new 'suspend card' string as the
existing one is used elsewhere as well)
The default symlink location can cause slowdowns and wasted CPU cycles
in VS Code and PyCharm/IntelliJ, as they try to watch Bazel's (large)
build folder for changes. The issue can be mostly ameliorated in VS Code
by excluding the symlinks using globs in settings like watcherExclude,
but the Rust extension doesn't support globs, so each folder needs to be
listed out separately. And because the product name symlink depends on
the name of the directory you're building from, we can't just include
the excludes in .vscode - it will depend on the folder the user is storing
things.
PyCharm and IntelliJ behave even worse here - they continue to monitor
for changes in all folders of the repo, even if those folders have been
marked as excluded in the project settings. Placing the folders into the
IDE-global Editor>File Types>Ignored Files And Folders works around this,
but again we run into troubles making this work out of the box, especially
with the product name in the symlink.
One option would be to turn the symlinks off completely. They are not
required for building, and for scripting/debugging, we can get the folder
locations via 'bazel info'. But with that approach, we would no longer
be able to symlink build products into the source tree, as we do for
things like the generated backend methods and translations, so we'd lose
code completion for them that way.
Another option would be to place the symlinks in .bazel/ inside the repo.
That solves the VS Code case (in conjunction with a workspace config file),
but doesn't fully fix IntelliJ/PyCharm.
The only remaining option I can see is to place the symlinks outside the
repo. Bazel won't expand ~ in the symlink path, so we can't use something
like ~/.cache/bazel/anki to place the files near the other build files.
So we end up having to have the files written to ../bazel/anki, in the
repo's parent folder. Not very clean, but I don't see a better alternative
at the moment.
.gitignore is still ignoring bazel-*, as currently bazel-dist and
bazel-pkg will be created when building/packaging. They should be fairly
innocuous, but we may want to rename them at one point.
Other changes:
- add missing symlink for pylib hooks
- add a sample .user.bazelrc file
* Add _bytes methods for all methods in the backend
Expose get_note in qt/aqt/mediasrv.py
* Satisfy formatter
* Rename _bytes function to _raw and have them bytes as input
* Fix backend generation
* Use lib/proto/deckOptions in deck-options
* Add exposed_backend to qt/aqt/mediasrv.py
* Move some more backend methods to exposed_backend_list
* Use protobufjs for congrats and i18n
* Use protobufjs for completeTag
* Use protobufjs services in change-notetype
* Reorder post handlers in alphabetical manner
* Satisfy tests
* Remove unused collection methods
* Rename access_backend to raw_backend_request
* Use _vendor.stringcase instead of creating a new function
* Remove SKIP_UNROLL_OUTPUT
* Directly call _run_command in non _raw methods
* Remove TranslateString, ChangeNotetype and CompleteTag from SKIP_UNROLL_INPUT
* Remove UpdateDeckConfigs from SKIP_UNROLL_INPUT
* Remove ChangeNotetype from SKIP_UNROLL_INPUT
* Remove SKIP_UNROLL_INPUT
* Fix typing issue with translate_string
- Adds typing support for Protobuf maps in genbackend.py
* Do not emit convenience method for protobuf TranslateString
* Implement custom study on backend
* Switch frontend to backend custom study
* Skip typecheck for new pb classes
* Build tag search string on backend
Also fixes escaping of special characters in tag names.
* `cram.cards` -> `cram.card_limit`
* Assign more meaningful names in `TagLimit`
* Broaden rustfmt glob
* Use `invalid_input()` helper
* Assign `FilteredDeckForUpdate` to temp var
* Implement `SearchBuilder`
* Rewrite `custom_study()` with `SearchBuilder`
* Replace match macros with `SearchBuilder`
* Remove `into_nodes_list` & `concatenate_searches`
- The way mypy gathers site packages has changed slightly, so we had to
update extendsitepkgs.py to work with it.
- Not sure if there's a way to avoid the ignore in
operations/__init__.py. mypy is still ensuring a provided argument has
a .changes attribute, so thankfully we don't seem to have lost much here.
* Make hard repeat the current step's interval in v3
Unless for the first step to avoid identical interval with Again.
* Make Hard repeat the current step's interval in v2
* Adjust test to new Hard behaviour
* Fix steps being mistaken for seconds
* Cap steps at `u32::max` seconds
* Fix overflow of steps in Rust
* Prevent overflow of `IntervalKind`
* Prevent overflow in `revlod/mod.rs`
Also replace some `as` with `from` and `try_from` as is recommended to
highlight potential issues.
* Ensure v2 doesn't store overflowing revlog ivls
* Lower steps cap in deck options
Whereas large card intervals are converted to days, revlog intervals use
i32s to store large numbers of seconds.
* Format
* The importer list have a Hook
Previously, add-on 175027074 simply edited the list once. It became impossible
since the list became a function. Hence I need a filter to add the list here.
@kelciour (nice to meet you by the way), you may be interested by it too (at
least if I believe efb1ce46d4 )
I would have preferred to use `anki.importing.base.Importer` instead of
`Any`. However, this leads to
> Name "anki.importing.base.Importer" is not defined [name-defined]
when I run test.
Helps to solve this would be welcomed
* mention the hook may not last too long (dae)
* Add py3.9 to hooks
This follows examples from efb1ce46d4 I assume the
hooks were missed because those were not considered types but strings.
I did not even try to run pyupgrade and did the change manually, then used bazel format
* remove wildcard import in find.py, and change Any to object (dae)
Bytes come in as an array of integers, which we could probably solve on
the PyO3 end instead, but this will do for now - the same fix is already
used for non-DB case.
https://forums.ankiweb.net/t/anki-2-1-50-beta/15608/15