Commit Graph

28 Commits

Author SHA1 Message Date
RumovZ
5f9451f547
Add apkg import/export on backend (#1743)
* Add apkg export on backend

* Filter out missing media-paths at write time

* Make TagMatcher::new() infallible

* Gather export data instead of copying directly

* Revert changes to rslib/src/tags/

* Reuse filename_is_safe/check_filename_safe()

* Accept func to produce MediaIter in export_apkg()

* Only store file folder once in MediaIter

* Use temporary tables for gathering

export_apkg() now accepts a search instead of a deck id. Decks are
gathered according to the matched notes' cards.

* Use schedule_as_new() to reset cards

* ExportData → ExchangeData

* Ignore ascii case when filtering system tags

* search_notes_cards_into_table →

search_cards_of_notes_into_table

* Start on apkg importing on backend

* Fix due dates in days for apkg export

* Refactor import-export/package

- Move media and meta code into appropriate modules.
- Normalize/check for normalization when deserializing media entries.

* Add SafeMediaEntry for deserialized MediaEntries

* Prepare media based on checksums

- Ensure all existing media files are hashed.
- Hash incoming files during preparation to detect conflicts.
- Uniquify names of conflicting files with hash (not notetype id).
- Mark media files as used while importing notes.
- Finally copy used media.

* Handle encoding in `replace_media_refs()`

* Add trait to keep down cow boilerplate

* Add notetypes immediately instaed of preparing

* Move target_col into Context

* Add notes immediately instaed of preparing

* Note id, not guid of conflicting notes

* Add import_decks()

* decks_configs → deck_configs

* Add import_deck_configs()

* Add import_cards(), import_revlog()

* Use dyn instead of generic for media_fn

Otherwise, would have to pass None with type annotation in the default
case.

* Fix signature of import_apkg()

* Fix search_cards_of_notes_into_table()

* Test new functions in text.rs

* Add roundtrip test for apkg (stub)

* Keep source id of imported cards (or skip)

* Keep source ids of imported revlog (or skip)

* Try to keep source ids of imported notes

* Make adding notetype with id undoable

* Wrap apkg import in transaction

* Keep source ids of imported deck configs (or skip)

* Handle card due dates and original due/did

* Fix importing cards/revlog

Card ids are manually uniquified.

* Factor out card importing

* Refactor card and revlog importing

* Factor out card importing

Also handle missing parents .

* Factor out note importing

* Factor out media importing

* Maybe upgrade scheduler of apkg

* Fix parent deck gathering

* Unconditionally import static media

* Fix deck importing edge cases

Test those edge cases, and add some global test helpers.

* Test note importing

* Let import_apkg() take a progress func

* Expand roundtrip apkg test

* Use fat pointer to avoid propogating generics

* Fix progress_fn type

* Expose apkg export/import on backend

* Return note log when importing apkg

* Fix archived collection name on apkg import

* Add CollectionOpWithBackendProgress

* Fix wrong Interrupted Exception being checked

* Add ClosedCollectionOp

* Add note ids to log and strip HTML

* Update progress when checking incoming media too

* Conditionally enable new importing in GUI

* Fix all_checksums() for media import

Entries of deleted files are nulled, not removed.

* Make apkg exporting on backend abortable

* Return number of notes imported from apkg

* Fix exception printing for QueryOp as well

* Add QueryOpWithBackendProgress

Also support backend exporting progress.

* Expose new apkg and colpkg exporting

* Open transaction in insert_data()

Was slowing down exporting by several orders of magnitude.

* Handle zstd-compressed apkg

* Add legacy arg to ExportAnkiPackage

Currently not exposed on the frontend

* Remove unused import in proto file

* Add symlink for typechecking of import_export_pb2

* Avoid kwargs in pb message creation, so typechecking is not lost

Protobuf's behaviour is rather subtle and I had to dig through the docs
to figure it out: set a field on a submessage to automatically assign 
the submessage to the parent, or call SetInParent() to persist a default
version of the field you specified.

* Avoid re-exporting protobuf msgs we only use internally

* Stop after one test failure

mypy often fails much faster than pylint

* Avoid an extra allocation when extracting media checksums

* Update progress after prepare_media() finishes

Otherwise the bulk of the import ends up being shown as "Checked: 0"
in the progress window.

* Show progress of note imports

Note import is the slowest part, so showing progress here makes the UI
feel more responsive.

* Reset filtered decks at import time

Before this change, filtered decks exported with scheduling remained
filtered on import, and maybe_remove_from_filtered_deck() moved cards
into them as their home deck, leading to errors during review.

We may still want to provide a way to preserve filtered decks on import,
but to do that we'll need to ensure we don't rewrite the home decks of
cards, and we'll need to ensure the home decks are included as part of
the import (or give an error if they're not).

https://github.com/ankitects/anki/pull/1743/files#r839346423

* Fix a corner-case where due dates were shifted by a day

This issue existed in the old Python code as well. We need to include
the user's UTC offset in the exported file, or days_elapsed falls back
on the v1 cutoff calculation, which may be a day earlier or later than
the v2 calculation.

* Log conflicting note in remapped nt case

* take_fields() → into_fields()

* Alias `[u8; 20]` with `Sha1Hash`

* Truncate logged fields

* Rework apkg note import tests

- Use macros for more helpful errors.
- Split monolith into unit tests.
- Fix some unknown error with the previous test along the way.
(Was failing after 969484de4388d225c9f17d94534b3ba0094c3568.)

* Fix sorting of imported decks

Also adjust the test, so it fails without the patch. It was only passing
before, because the parent deck happened to come before the
inconsistently capitalised child alphabetically. But we want all parent
decks to be imported before their child decks, so their children can
adopt their capitalisation.

* target[_id]s → existing_card[_id]s

* export_collection_extracting_media() → ...

export_into_collection_file()

* target_already_exists→card_ordinal_already_exists

* Add search_cards_of_notes_into_table.sql

* Imrove type of apkg export selector/limit

* Remove redundant call to mod_schema()

* Parent tooltips to mw

* Fix a crash when truncating note text

String::truncate() is a bit of a footgun, and I've hit this before
too :-)

* Remove ExportLimit in favour of separate classes

* Remove OpWithBackendProgress and ClosedCollectionOp

Backend progress logic is now in ProgressManager. QueryOp can be used
for running on closed collection.

Also fix aborting of colpkg exports, which slipped through in #1817.

* Tidy up import log

* Avoid QDialog.exec()

* Default to excluding scheuling for deck list deck

* Use IncrementalProgress in whole import_export code

* Compare checksums when importing colpkgs

* Avoid registering changes if hashes are not needed

* ImportProgress::Collection → ImportProgress::File

* Make downgrading apkgs depend on meta version

* Generalise IncrementableProgress

And use it in entire import_export code instead.

* Fix type complexity lint

* Take count_map for IncrementableProgress::get_inner

* Replace import/export env with Shift click

* Accept all args from update() for backend progress

* Pass fields of ProgressUpdate explicitly

* Move update_interval into IncrementableProgress

* Outsource incrementing into Incrementor

* Mutate ProgressUpdate in progress_update callback

* Switch import/export legacy toggle to profile setting

Shift would have been nice, but the existing shortcuts complicate things.
If the user triggers an import with ctrl+shift+i, shift is unlikely to
have been released by the time our code runs, meaning the user accidentally
triggers the new code. We could potentially wait a while before bringing
up the dialog, but then we're forced to guess at how long it will take the
user to release the key.

One alternative would be to use alt instead of shift, but then we need to
trigger our shortcut when that key is pressed as well, and it could
potentially cause a conflict with an add-on that already uses that
combination.

* Show extension in export dialog

* Continue to provide separate options for schema 11+18 colpkg export

* Default to colpkg export when using File>Export

* Improve appearance of combo boxes when switching between apkg/colpkg

+ Deal with long deck names

* Convert newlines to spaces when showing fields from import

Ensures each imported note appears on a separate line

* Don't separate total note count from the other summary lines

This may come down to personal preference, but I feel the other counts
are equally as important, and separating them feels like it makes it
a bit easier to ignore them.

* Fix 'deck not normal' error when importing a filtered deck for the 2nd time

* Fix [Identical] being shown on first import

* Revert "Continue to provide separate options for schema 11+18 colpkg export"

This reverts commit 8f0b2c175f4794d642823b60414d142a12768441.

Will use a different approach

* Move legacy support into a separate exporter option; add to apkg export

* Adjust 'too new' message to also apply to .apkg import case

* Show a better message when attempting to import new apkg into old code

Previously the user could end seeing a message like:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb5 in position 1: invalid start byte

Unfortunately we can't retroactively fix this for older clients.

* Hide legacy support option in older exporting screen

* Reflect change from paths to fnames in type & name

* Make imported decks normal at once

Then skip special casing in update_deck(). Also skip updating
description if new one is empty.

Co-authored-by: Damien Elmes <gpg@ankiweb.net>
2022-05-02 21:12:46 +10:00
Damien Elmes
72c7d64876 Fix compatibility with older macOS versions
https://forums.ankiweb.net/t/anki-2-1-50-qt5-wont-open/19091
2022-04-15 17:35:05 +10:00
Damien Elmes
95dbf30fb9 updates to the build process and binary bundles
All platforms:

- rename scripts/ to tools/: Bazelisk expects to find its wrapper script
(used by the Mac changes below) in tools/. Rather than have a separate
scripts/ and tools/, it's simpler to just move everything into tools/.
- wheel outputs and binary bundles now go into .bazel/out/dist. While
not technically Bazel build products, doing it this way ensures they get
cleaned up when 'bazel clean' is run, and it keeps them out of the source
folder.
- update to the latest Bazel

Windows changes:

- bazel.bat has been removed, and tools\setup-env.bat has been added.
Other scripts like .\run.bat will automatically call it to set up the
environment.
- because Bazel is now on the path, you can 'bazel test ...' from any
folder, instead of having to do \anki\bazel.
- the bat files can handle being called from any working directory,
so things like running "\anki\tools\python" from c:\ will work.
- build installer as part of bundling process

Mac changes:

- `arch -arch x86_64 bazel ...` will now automatically use a different
build root, so that it is cheap to switch back and forth between archs
on a new Mac.
- tools/run-qt* will now automatically use Rosetta
- disable jemalloc in Mac x86 build for now, as it won't build under
Rosetta (perhaps due to its build scripts using $host_cpu instead of
$target_cpu)
- create app bundle as part of bundling process

Linux changes:

- remove arm64 orjson workaround in Linux bundle, as without a
readily-available, relatively distro-agonstic PyQt/Qt build
we can use, the arm64 Linux bundle is of very limited usefulness.
- update Docker files for release build
- include fcitx5 in both the qt5 and qt6 bundles
- create tarballs as part of the bundling process
2022-02-10 19:23:07 +10:00
Damien Elmes
f842ab7c9d switch convenience symlinks to .bazel/
Unfortunately 5efaf5a4be broke the Svelte
language tools - presumably having paths outside of the repo is confusing
them.

As a plan B, the symlinks have been shifted to a single subdir. Along
with some exclusions in the VS Code config, this should allow VS Code
to continue to work out of the box, but the docs will need updating
to reflect the extra work required for PyCharm/IntelliJ.

+ fix svelte-check execution on a system without node installed. It
still throws up some errors that are presumably caused by our multiple
rootDirs - not sure if there's an easy way to work around that.
2022-01-24 11:06:02 +10:00
Damien Elmes
5efaf5a4be move Bazel convenience symlinks outside of repo folder
The default symlink location can cause slowdowns and wasted CPU cycles
in VS Code and PyCharm/IntelliJ, as they try to watch Bazel's (large)
build folder for changes. The issue can be mostly ameliorated in VS Code
by excluding the symlinks using globs in settings like watcherExclude,
but the Rust extension doesn't support globs, so each folder needs to be
listed out separately. And because the product name symlink depends on
the name of the directory you're building from, we can't just include
the excludes in .vscode - it will depend on the folder the user is storing
things.

PyCharm and IntelliJ behave even worse here - they continue to monitor
for changes in all folders of the repo, even if those folders have been
marked as excluded in the project settings. Placing the folders into the
IDE-global Editor>File Types>Ignored Files And Folders works around this,
but again we run into troubles making this work out of the box, especially
with the product name in the symlink.

One option would be to turn the symlinks off completely. They are not
required for building, and for scripting/debugging, we can get the folder
locations via 'bazel info'. But with that approach, we would no longer
be able to symlink build products into the source tree, as we do for
things like the generated backend methods and translations, so we'd lose
code completion for them that way.

Another option would be to place the symlinks in .bazel/ inside the repo.
That solves the VS Code case (in conjunction with a workspace config file),
but doesn't fully fix IntelliJ/PyCharm.

The only remaining option I can see is to place the symlinks outside the
repo. Bazel won't expand ~ in the symlink path, so we can't use something
like ~/.cache/bazel/anki to place the files near the other build files.
So we end up having to have the files written to ../bazel/anki, in the
repo's parent folder. Not very clean, but I don't see a better alternative
at the moment.

.gitignore is still ignoring bazel-*, as currently bazel-dist and
bazel-pkg will be created when building/packaging. They should be fairly
innocuous, but we may want to rename them at one point.

Other changes:

- add missing symlink for pylib hooks
- add a sample .user.bazelrc file
2022-01-23 19:18:44 +10:00
Damien Elmes
67ee6f9c0e update to Rust 1.57 + latest rules_rust 2021-12-03 20:35:52 +10:00
Damien Elmes
bdbcb6d7aa default to a vendored copy of Python
Brings Python in line with our other dependencies, and means users
no longer need to install it prior to building, or deal with
issues caused by having the wrong version available.
2021-10-15 22:14:05 +10:00
Damien Elmes
e9c7b2287f bump minimum Python to 3.9 2021-10-04 15:05:15 +10:00
Damien Elmes
944b064e54 update Rust deps 2021-10-02 20:42:03 +10:00
Damien Elmes
d73852f272 use separate integration test for links
If we run into issues with unreliable network connections in the future,
we'll be able to mark the test as flaky so Bazel can retry it multiple
times.
2021-07-24 10:12:25 +10:00
Damien Elmes
b1dedb1b1f disable link check outside CI 2021-07-23 20:22:32 +10:00
Damien Elmes
d3062ceda4 fix release builds 2021-06-24 15:01:32 +10:00
Damien Elmes
1cc63f9267 update to latest rules_rust incremental compilation 2021-04-09 12:48:24 +10:00
Damien Elmes
6f7a4bf29e update rules_rust with worker refactor
If you were using the optional Rust worker support, please see the
change to development.md
2021-03-30 17:24:51 +10:00
Damien Elmes
89d249b3b6 update to the latest rules_rust + security framework update 2021-03-27 19:28:19 +10:00
Damien Elmes
dbcfaadd79 we no longer need to force CI to local 2021-03-21 00:25:59 +10:00
Damien Elmes
3c1fa68460 turn top bar dark when night mode enabled on macOS 2021-02-04 19:19:56 +10:00
Damien Elmes
ed7e709285 update pinned Bazel version to 4.0 release 2021-01-21 19:58:41 +10:00
Damien Elmes
68e3d55c17 typo 2020-12-29 15:26:06 +10:00
Damien Elmes
e948544b59 use local strategy for Svelte on CI
Allows some type errors to surface that were only being picked up
on Windows.

The root cause seems to be TypeScript picking up other .d.ts/.tsx
files in the same folder, which it can only do on Windows due to the
lack of sandboxing. On other platforms the other files can't be found,
and tsc changes the types into 'any'.

I experimented with modifying rules_svelte to build all .tsx files up
front and convert them to .d.ts in bulk, but ran into further issues
with conflicting types, as the typings in svelte2tsx seem to conflict
with Svelte's built-in types, and passing the dependencies in explicitly
causes them to be checked even though --skipLibCheck is passed in to
TypeScript.

Forcing sandboxing off is an ugly hack, and our best approach moving
forward may be to switch to ts_project for the Svelte generation -
it does appear that rules_nodejs favours it over ts_library anyway.
2020-12-29 14:50:33 +10:00
Damien Elmes
355e4cd519 use PYTHON_SYS_EXECUTABLE for setting path to Python 2020-12-23 21:53:13 +10:00
Damien Elmes
b80f33d14d document worker and disable it by default 2020-12-11 21:04:06 +10:00
Damien Elmes
fb680aa77e update rules_rust + persistent_worker 2020-12-10 15:35:37 +10:00
Damien Elmes
b55fd93cab manual tag on rslib was preventing clippy lints 2020-11-24 20:10:16 +10:00
Damien Elmes
e5059a5dff update Windows CI 2020-11-09 19:09:23 +10:00
Damien Elmes
f93f01b3e9 support user bazelrc 2020-11-04 22:11:28 +10:00
Damien Elmes
c179a2b45b extract version from defs.bzl; gate buildhash on optimized build 2020-11-04 14:02:08 +10:00
Damien Elmes
aea0a6fcc6 initial Bazel conversion
Running and testing should be working on the three platforms, but
there's still a fair bit that needs to be done:

- Wheel building + testing in a venv still needs to be implemented.
- Python requirements still need to be compiled with piptool and pinned;
need to compile on all platforms then merge
- Cargo deps in cargo/ and rslib/ need to be cleaned up, and ideally
unified into one place
- Currently using rustls to work around openssl compilation issues
on Linux, but this will break corporate proxies with custom SSL
authorities; need to conditionally use openssl or use
https://github.com/seanmonstar/reqwest/pull/1058
- Makefiles and docs still need cleaning up
- It may make sense to reparent ts/* to the top level, as we don't
nest the other modules under a specific language.
- rspy and pylib must always be updated in lock-step, so merging
rspy into pylib as a private module would simplify things.
- Merging desktop-ftl and mobile-ftl into the core ftl would make
managing and updating translations easier.
- Obsolete scripts need removing.
- And probably more.
2020-11-01 14:26:58 +10:00