(for upgrading users, please see the notes at the bottom)
Bazel brought a lot of nice things to the table, such as rebuilds based on
content changes instead of modification times, caching of build products,
detection of incorrect build rules via a sandbox, and so on. Rewriting the build
in Bazel was also an opportunity to improve on the Makefile-based build we had
prior, which was pretty poor: most dependencies were external or not pinned, and
the build graph was poorly defined and mostly serialized. It was not uncommon
for fresh checkouts to fail due to floating dependencies, or for things to break
when trying to switch to an older commit.
For day-to-day development, I think Bazel served us reasonably well - we could
generally switch between branches while being confident that builds would be
correct and reasonably fast, and not require full rebuilds (except on Windows,
where the lack of a sandbox and the TS rules would cause build breakages when TS
files were renamed/removed).
Bazel achieves that reliability by defining rules for each programming language
that define how source files should be turned into outputs. For the rules to
work with Bazel's sandboxing approach, they often have to reimplement or
partially bypass the standard tools that each programming language provides. The
Rust rules call Rust's compiler directly for example, instead of using Cargo,
and the Python rules extract each PyPi package into a separate folder that gets
added to sys.path.
These separate language rules allow proper declaration of inputs and outputs,
and offer some advantages such as caching of build products and fine-grained
dependency installation. But they also bring some downsides:
- The rules don't always support use-cases/platforms that the standard language
tools do, meaning they need to be patched to be used. I've had to contribute a
number of patches to the Rust, Python and JS rules to unblock various issues.
- The dependencies we use with each language sometimes make assumptions that do
not hold in Bazel, meaning they either need to be pinned or patched, or the
language rules need to be adjusted to accommodate them.
I was hopeful that after the initial setup work, things would be relatively
smooth-sailing. Unfortunately, that has not proved to be the case. Things
frequently broke when dependencies or the language rules were updated, and I
began to get frustrated at the amount of Anki development time I was instead
spending on build system upkeep. It's now about 2 years since switching to
Bazel, and I think it's time to cut losses, and switch to something else that's
a better fit.
The new build system is based on a small build tool called Ninja, and some
custom Rust code in build/. This means that to build Anki, Bazel is no longer
required, but Ninja and Rust need to be installed on your system. Python and
Node toolchains are automatically downloaded like in Bazel.
This new build system should result in faster builds in some cases:
- Because we're using cargo to build now, Rust builds are able to take advantage
of pipelining and incremental debug builds, which we didn't have with Bazel.
It's also easier to override the default linker on Linux/macOS, which can
further improve speeds.
- External Rust crates are now built with opt=1, which improves performance
of debug builds.
- Esbuild is now used to transpile TypeScript, instead of invoking the TypeScript
compiler. This results in faster builds, by deferring typechecking to test/check
time, and by allowing more work to happen in parallel.
As an example of the differences, when testing with the mold linker on Linux,
adding a new message to tags.proto (which triggers a recompile of the bulk of
the Rust and TypeScript code) results in a compile that goes from about 22s on
Bazel to about 7s in the new system. With the standard linker, it's about 9s.
Some other changes of note:
- Our Rust workspace now uses cargo-hakari to ensure all packages agree on
available features, preventing unnecessary rebuilds.
- pylib/anki is now a PEP420 implicit namespace, avoiding the need to merge
source files and generated files into a single folder for running. By telling
VSCode about the extra search path, code completion now works with generated
files without needing to symlink them into the source folder.
- qt/aqt can't use PEP420 as it's difficult to get rid of aqt/__init__.py.
Instead, the generated files are now placed in a separate _aqt package that's
added to the path.
- ts/lib is now exposed as @tslib, so the source code and generated code can be
provided under the same namespace without a merging step.
- MyPy and PyLint are now invoked once for the entire codebase.
- dprint will be used to format TypeScript/json files in the future instead of
the slower prettier (currently turned off to avoid causing conflicts). It can
automatically defer to prettier when formatting Svelte files.
- svelte-check is now used for typechecking our Svelte code, which revealed a
few typing issues that went undetected with the old system.
- The Jest unit tests now work on Windows as well.
If you're upgrading from Bazel, updated usage instructions are in docs/development.md and docs/build.md. A summary of the changes:
- please remove node_modules and .bazel
- install rustup (https://rustup.rs/)
- install rsync if not already installed (on windows, use pacman - see docs/windows.md)
- install Ninja (unzip from https://github.com/ninja-build/ninja/releases/tag/v1.11.1 and
place on your path, or from your distro/homebrew if it's 1.10+)
- update .vscode/settings.json from .vscode.dist
* Expose cloze text as attribute on front side
* Update test_models.py
* Update template_filters.rs
* Escape HTML for data-attribute
* Use minimal HTML encoding in Rust
to match Python's html.escape and pass tests.
* Rename attribute to data-cloze
to make it more generic.
* Run formatter
* Revert to using Rust encode_attribute and add helper function for tests
* Skip unparsable pylib files with black
This was causing black to return a non-zero code, preventing the
subsequent isort from running.
* Organise imports on save
On a Linux machine here, the tests consistently fail when two copies
of black are run at once:
% bazel test //qt:format_check //pylib:format_check --cache_test_results=no
==================== Test output for //qt:format_check:
Process SyncManager-1:
Traceback (most recent call last):
File "/home/dae/.cache/bazel/_bazel_dae/fc22e40cbbf8b7d16ac57a00991b1ef1/external/python/lib/python3.9/multiprocessing/process.py", line 315, in _bootstrap
self.run()
File "/home/dae/.cache/bazel/_bazel_dae/fc22e40cbbf8b7d16ac57a00991b1ef1/external/python/lib/python3.9/multiprocessing/process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "/home/dae/.cache/bazel/_bazel_dae/fc22e40cbbf8b7d16ac57a00991b1ef1/external/python/lib/python3.9/multiprocessing/managers.py", line 583, in _run_server
server = cls._Server(registry, address, authkey, serializer)
File "/home/dae/.cache/bazel/_bazel_dae/fc22e40cbbf8b7d16ac57a00991b1ef1/external/python/lib/python3.9/multiprocessing/managers.py", line 156, in __init__
self.listener = Listener(address=address, backlog=16)
File "/home/dae/.cache/bazel/_bazel_dae/fc22e40cbbf8b7d16ac57a00991b1ef1/external/python/lib/python3.9/multiprocessing/connection.py", line 453, in __init__
self._listener = SocketListener(address, family, backlog)
File "/home/dae/.cache/bazel/_bazel_dae/fc22e40cbbf8b7d16ac57a00991b1ef1/external/python/lib/python3.9/multiprocessing/connection.py", line 596, in __init__
self._socket.bind(address)
OSError: [Errno 98] Address already in use
I dug briefly into Black's code, but suspect this is actually an issue
with the multiprocessing library. Didn't have time to investigate it
further; this workaround will do for now.
(One day I'll get around to merging those separate scripts into a single
one. One day. :-))
* avoid repinning Rust deps by default
* add id_tree dependency
* Respect intermediate child limits in v3
* Test new behaviour of v3 counts
* Rework v3 queue building to respect parent limits
* Add missing did field to SQL query
* Fix `LimitTreeMap::is_exhausted()`
* Rework tree building logic
https://github.com/ankitects/anki/pull/1638#discussion_r798328734
* Add timer for build_queues()
* `is_exhausted()` -> `limit_reached()`
* Move context and limits into `QueueBuilder`
This allows for moving more logic into QueueBuilder, so less passing
around of arguments. Unfortunately, some tests will require additional
work to set up.
* Fix stop condition in new_cards_by_position
* Fix order gather order of new cards by deck
* Add scheduler/queue/builder/burying.rs
* Fix bad tree due to unsorted child decks
* Fix comment
* Fix `cap_new_to_review_rec()`
* Add test for new card gathering
* Always sort `child_decks()`
* Fix deck removal in `cap_new_to_review_rec()`
* Fix sibling ordering in new card gathering
* Remove limits for deck total count with children
* Add random gather order
* Remove bad sibling order handling
All routines ensure ascending order now.
Also do some other minor refactoring.
* Remove queue truncating
All routines stop now as soon as the root limit is reached.
* Move deck fetching into `QueueBuilder::new()`
* Rework new card gather and sort options
https://github.com/ankitects/anki/pull/1638#issuecomment-1032173013
* Disable new sort order choices ...
depending on set gather order.
* Use enum instead of numbers
* Ensure valid sort order setting
* Update new gather and sort order tooltips
* Warn about random insertion order with v3
* Revert "Add timer for build_queues()"
This reverts commit c9f5fc6ebe87953c17a0c842990b009b5596c69c.
* Update rslib/src/storage/card/mod.rs (dae)
* minor wording tweaks to the tooltips (dae)
+ move legacy strings to bottom
+ consistent capitalization (our leech action still needs fixing,
but that will require introducing a new 'suspend card' string as the
existing one is used elsewhere as well)
* Add _bytes methods for all methods in the backend
Expose get_note in qt/aqt/mediasrv.py
* Satisfy formatter
* Rename _bytes function to _raw and have them bytes as input
* Fix backend generation
* Use lib/proto/deckOptions in deck-options
* Add exposed_backend to qt/aqt/mediasrv.py
* Move some more backend methods to exposed_backend_list
* Use protobufjs for congrats and i18n
* Use protobufjs for completeTag
* Use protobufjs services in change-notetype
* Reorder post handlers in alphabetical manner
* Satisfy tests
* Remove unused collection methods
* Rename access_backend to raw_backend_request
* Use _vendor.stringcase instead of creating a new function
* Remove SKIP_UNROLL_OUTPUT
* Directly call _run_command in non _raw methods
* Remove TranslateString, ChangeNotetype and CompleteTag from SKIP_UNROLL_INPUT
* Remove UpdateDeckConfigs from SKIP_UNROLL_INPUT
* Remove ChangeNotetype from SKIP_UNROLL_INPUT
* Remove SKIP_UNROLL_INPUT
* Fix typing issue with translate_string
- Adds typing support for Protobuf maps in genbackend.py
* Do not emit convenience method for protobuf TranslateString
* Make hard repeat the current step's interval in v3
Unless for the first step to avoid identical interval with Again.
* Make Hard repeat the current step's interval in v2
* Adjust test to new Hard behaviour
* Remove flooring in v3 scheduler code
It is no longer supposed to be an exact port of the old Python code.
* Rework v3 fuzzing
https://github.com/ankitects/anki/issues/1416#issuecomment-958208149
* Ensure length of fuzz range is larger than 1
Only for new intervals larger than 1 and respecting max review interval.
* add the beginnings of a unit test
* Clarify `fuzz_factor` doc string
* Fix Python tests for 2021 scheduler
* Fix fuzz test
1.0 is not a valid fuzz factor.
* Add tests for fuzzing in Rust
* Use range notation in fuzz factor doc
* Strip redundant tests
* PEP8 dbproxy.py
* PEP8 errors.py
* PEP8 httpclient.py
* PEP8 lang.py
* PEP8 latex.py
* Add decorator to deprectate key words
* Make replacement for deprecated attribute optional
* Use new helper `_print_replacement_warning()`
* PEP8 media.py
* PEP8 rsbackend.py
* PEP8 sound.py
* PEP8 stdmodels.py
* PEP8 storage.py
* PEP8 sync.py
* PEP8 tags.py
* PEP8 template.py
* PEP8 types.py
* Fix DeprecatedNamesMixinForModule
The class methods need to be overridden with instance methods, so every
module has its own dicts.
* Use `# pylint: disable=invalid-name` instead of id
* PEP8 utils.py
* Only decorate `__getattr__` with `@no_type_check`
* Fix mypy issue with snakecase
Importing it from `anki._vendor` raises attribute errors.
* Format
* Remove inheritance of DeprecatedNamesMixin
There's almost no shared code now and overriding classmethods with
instance methods raises mypy issues.
* Fix traceback frames of deprecation warnings
* remove fn/TimedLog (dae)
Neither Anki nor add-ons appear to have been using it
* fix some issues with stringcase use (dae)
- the wheel was depending on the PyPI version instead of our vendored
version
- _vendor:stringcase should not have been listed in the anki py_library.
We already include the sources in py_srcs, and need to refer to them
directly. By listing _vendor:stringcase as well, we were making a
top-level stringcase library available, which would have only worked for
distributing because the wheel definition was also incorrect.
- mypy errors are what caused me to mistakenly add the above - they
were because the type: ignore at the top of stringcase.py was causing
mypy to completely ignore the file, so it was not aware of any attributes
it contained.
* Only collect card stats on the backend ...
... instead of rendering an HTML string using askama.
* Add ts page Card Info
* Update test for new `col.card_stats()`
* Remove obsolete CardStats code
* Use new ts page in `CardInfoDialog`
* Align start and end instead of left and right
Curiously, `text-align: start` does not work for `th` tags if assigned
via classes.
* Adopt ts refactorings after rebase
#1405 and #1409
* Clean up `ts/card-info/BUILD.bazel`
* Port card info logic from Rust to TS
* Move repeated field to the top
https://github.com/ankitects/anki/pull/1414#discussion_r725402730
* Convert pseudo classes to interfaces
* CardInfoPage -> CardInfo
* Make revlog in card info optional
* Add legacy support for old card stats
* Check for undefined instead of falsy
* Make Revlog separate component
* drop askama dependency (dae)
* Fix nightmode for legacy card stats
This adds Python 3.9 and 3.10 typing syntax to files that import
attributions from __future___. Python 3.9 should be able to cope with
the 3.10 syntax, but Python 3.8 will no longer work.
On Windows/Mac, install the latest Python 3.9 version from python.org.
There are currently no orjson wheels for Python 3.10 on Windows/Mac,
which will break the build unless you have Rust installed separately.
On Linux, modern distros should have Python 3.9 available already. If
you're on an older distro, you'll need to build Python from source first.
Python's regex engine performs pathologically on regexes like
'<!--.*?-->' when fed a large string of repeating '<!--' clauses.
Thanks to JaimeSlome / security@huntr.dev for the report; closes#1380.
Solved by switching to the Rust implementation, which does not suffer
from this issue.
entsToText(), minimizeHTML(), and the old regex constants have been
removed; they do not appear to be used by any add-ons.
Interday learning cards are now counted in the learning count again,
and are no longer subject to the daily review limit.
The thinking behind the original change was that interday learning cards
are scheduled more like reviews, and counting them in the review count
would allow the learning count to focus on intraday learning - the red
number reflecting the fact that they are the most fragile memories. And
counting them together made it practical to apply the review limit
to both at once.
Since the release, there have been a number of users expecting to see
interday learning cards included in the learning count (the latest being
https://forums.ankiweb.net/t/feedback-and-a-feature-adjustment-request-for-2-1-45/12308),
and a good argument can be made for that too - they are, after all, listed
in the learning steps, and do tend to be harder than reviews. Short of
introducing another count to keep track of interday and intraday learning
separately, moving back to the old behaviour seems like the best move.
This also means it is not really practical to apply the review limit to
interday learning cards anymore, as the limit would be split between two
different numbers, and how much each number is capped would depend on
the order cards are introduced. The scheduler could figure this out, but
the deck list code does not know card order, and would need significant
changes to be able to produce numbers that matched the scheduler. And
even if we ignore implementation complexities, I think it would be more
difficult for users to reason about - the influence of the review limit
on new cards is confusing enough as it is.
This was flawed - while non-Latin text is now acceptable
in an IRI, we still need to be concerned with reserved characters
such as spaces, and Anki unfortunately has been storing the filenames
in unencoded form in the DB, meaning we must encode them at display
time. We won't be able to move away from this until existing notes
are rewritten, and it will probably require breaking compatibility with
older clients.
https://en.wikipedia.org/wiki/Internationalized_Resource_Identifier
This reverts commit 14110add55.
Will allow importing the Protobuf without pulling in the rest of
the library. This is not a full PEP420 namespace, and the wheel still
bundles everything - it just makes things easier in a Bazel workspace.
I originally tried with PEP420, but it required more invasive changes,
and I ran into issues with mypy.
Back in the WebKit days, images with Unicode filenames would fail to
appear if they weren't percent-escaped. This no longer seems to be the
case - with this patch, images appear correctly on the Mac and Windows
platforms I tested with.
Fixes https://forums.ankiweb.net/t/anki-2-1-45-beta/10664/96Fixes#1219
mypy's move to external types-* packages is a PITA, as it requires them
to be installed in site-packages, and provides no way to specify a custom
site-packages folder, necessitating extra scripts to mock the
site-packages path, and copy+rename the stub packages into a separate
folder.
- changes can now be undone
- the same field can now be mapped to multiple target fields, allowing
fields to be cloned
- the old Qt dialog has been removed
- the old col.models.change() API calls the new code, to avoid
breaking existing consumers. It requires the field map to always
be passed in, but that appears to have been the common case.
- closes#1175
- Daily limits are no longer inherited - each deck limits its own
cards, and the selected deck enforces a maximum limit.
- Fetching of review cards now uses a single query, and sorts in advance.
In collections with a large number of overdue cards and decks, this is
faster than iterating over each deck in turn.
- Include interday learning count in review count & review limit, and
allow them to be buried.
- Warn when parent review limit is lower than child deck in deck options.
- Cap the new card limit to the review limit.
- Add option to control whether new card fetching short-circuits.
The explicit flush was clearing undo history, and the hook will need
re-working to support propagating OpChanges correctly. It will likely
come back as a GUI hook, instead of one in pylib.