anki/qt/aqt/mediasrv.py
Damien Elmes 45f5709214
Migrate to protobuf-es (#2547)
* Fix .no-reduce-motion missing from graphs spinner, and not being honored

* Begin migration from protobuf.js -> protobuf-es

Motivation:

- Protobuf-es has a nicer API: messages are represented as classes, and
fields which should exist are not marked as nullable.
- As it uses modules, only the proto messages we actually use get included
in our bundle output. Protobuf.js put everything in a namespace, which
prevented tree-shaking, and made it awkward to access inner messages.
- ./run after touching a proto file drops from about 8s to 6s on my machine. The tradeoff
is slower decoding/encoding (#2043), but that was mainly a concern for the
graphs page, and was unblocked by
37151213cd

Approach/notes:

- We generate the new protobuf-es interface in addition to existing
protobuf.js interface, so we can migrate a module at a time, starting
with the graphs module.
- rslib:proto now generates RPC methods for TS in addition to the Python
interface. The input-arg-unrolling behaviour of the Python generation is
not required here, as we declare the input arg as a PlainMessage<T>, which
marks it as requiring all fields to be provided.
- i64 is represented as bigint in protobuf-es. We were using a patch to
protobuf.js to get it to output Javascript numbers instead of long.js
types, but now that our supported browser versions support bigint, it's
probably worth biting the bullet and migrating to bigint use. Our IDs
fit comfortably within MAX_SAFE_INTEGER, but that may not hold for future
fields we add.
- Oneofs are handled differently in protobuf-es, and are going to need
some refactoring.
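
The bigint note above can be illustrated from the Python side. This is a minimal sketch, assuming only that JavaScript numbers are IEEE 754 doubles (the same representation as Python's float). The helper name and the sample values are hypothetical and are not taken from Anki's code:

```python
# JavaScript's Number.MAX_SAFE_INTEGER: the largest n such that both n and
# n + 1 are exactly representable as IEEE 754 doubles.
MAX_SAFE_INTEGER = 2**53 - 1


def survives_double(value: int) -> bool:
    """Return True if value is unchanged after a round trip through a double."""
    return int(float(value)) == value


# Hypothetical sample values, for illustration only.
millisecond_id = 1_686_746_857_000  # a timestamp-sized ID
largest_i64 = 2**63 - 1  # the largest value an i64 field can hold

print(survives_double(millisecond_id))        # True
print(survives_double(MAX_SAFE_INTEGER))      # True
print(survives_double(MAX_SAFE_INTEGER + 2))  # False
print(survives_double(largest_i64))           # False
```

Values below the limit pass through a plain number unchanged, while a field that uses the full i64 range would be silently rounded, which is the case bigint avoids.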

Other notable changes:

- Added a --mkdir arg to our build runner, so we can create a dir easily
during the build on Windows.
- Simplified the preference handling code, by wrapping the preferences
in an outer store, instead of a separate store for each individual
preference. This means a change to one preference will trigger a redraw
of all components that depend on the preference store, but the redrawing
is cheap after moving the data processing to Rust, and it makes the code
easier to follow.
- Drop async(Reactive).ts in favour of more explicit handling with await
blocks/updating.
- Renamed add_inputs_to_group() -> add_dependency(), and fixed it not adding
dependencies to parent groups. Renamed add() -> add_action() for clarity.

* Remove a couple of unused proto imports

* Migrate card info

* Migrate congrats, image occlusion, and tag editor

+ Fix imports for multi-word proto files.

* Migrate change-notetype

* Migrate deck options

* Bump target to es2020; simplify ts lib list

Used caniuse.com to confirm that Chromium 77, iOS 14.5 and Chrome on
Android support the full es2017-es2020 feature set.

* Migrate import-csv

* Migrate i18n and fix missing output types in .js

* Migrate custom scheduling, and remove protobuf.js

To mostly maintain our old API contract, we make use of protobuf-es's
ability to convert to JSON, which follows the same format as protobuf.js
did. It doesn't cover all cases: users who were previously changing the
variant of a type will need to update their code, as assigning to a new
variant no longer automatically removes the old one, which will cause an
error when we try to convert back from JSON. But I suspect the large majority
of users are adjusting the current variant rather than creating a new one,
and this saves us having to write proxy wrappers, so it seems like a
reasonable compromise.
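
The variant problem described above can be modelled with a plain dict. This is only a sketch of the idea, not protobuf-es or Anki code: the field names, the sample payloads, and the `active_kind` helper are all hypothetical, and the check merely imitates a parser that rejects a message with more than one field of the same oneof set.

```python
# Hypothetical names of the fields that share a single oneof.
KIND_FIELDS = frozenset({"new", "learning", "review"})


def active_kind(message: dict) -> str:
    """Return the single populated oneof field, or raise if it is ambiguous."""
    present = sorted(KIND_FIELDS.intersection(message))
    if len(present) != 1:
        raise ValueError(f"expected exactly one oneof field, got {present}")
    return present[0]


state = {"new": {"position": 0}}

# Adjusting the current variant keeps the message valid.
state["new"]["position"] = 5
print(active_kind(state))

# Assigning a new variant leaves the old key behind, so the check fails.
state["review"] = {"scheduledDays": 3}
try:
    active_kind(state)
except ValueError as error:
    print(f"rejected: {error}")

# Removing the old variant first makes the message valid again.
del state["new"]
print(active_kind(state))
```

In this model, code that switches variants has to delete the old key explicitly, which matches the update the commit message asks affected users to make.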

One other change I made at the same time was to rename value->kind for
the oneofs in our custom study protos, as 'value' was easily confused
with the 'case/value' output that protobuf-es has.

With protobuf.js codegen removed, touching a proto file and invoking
./run drops from about 8s to 6s.

This closes #2043.

* Allow tree-shaking on protobuf types

* Display backend error messages in our ts alert()

* Make sourcemap generation opt-in for ts-run

Considerably slows down build, and not used most of the time.
2023-06-14 22:47:37 +10:00

551 lines
16 KiB
Python

# Copyright: Ankitects Pty Ltd and contributors
# License: GNU AGPL, version 3 or later; http://www.gnu.org/licenses/agpl.html

from __future__ import annotations

import logging
import mimetypes
import os
import re
import sys
import threading
import time
import traceback
from dataclasses import dataclass
from http import HTTPStatus
from typing import Callable

import flask
import flask_cors
import stringcase
from flask import Response, request
from waitress.server import create_server

import aqt
import aqt.main
import aqt.operations
from anki import hooks
from anki.collection import OpChanges
from anki.decks import UpdateDeckConfigs
from anki.scheduler.v3 import SchedulingStatesWithContext, SetSchedulingStatesRequest
from anki.utils import dev_mode
from aqt.changenotetype import ChangeNotetypeDialog
from aqt.deckoptions import DeckOptionsDialog
from aqt.import_export.import_csv_dialog import ImportCsvDialog
from aqt.operations.deck import update_deck_configs as update_deck_configs_op
from aqt.qt import *
from aqt.utils import aqt_data_path

app = flask.Flask(__name__, root_path="/fake")
flask_cors.CORS(app)


@dataclass
class LocalFileRequest:
    # base folder, eg media folder
    root: str
    # path to file relative to root folder
    path: str


@dataclass
class BundledFileRequest:
    # path relative to aqt data folder
    path: str


@dataclass
class NotFound:
    message: str


DynamicRequest = Callable[[], Response]


class MediaServer(threading.Thread):
    _ready = threading.Event()
    daemon = True

    def __init__(self, mw: aqt.main.AnkiQt) -> None:
        super().__init__()
        self.is_shutdown = False
        # map of webview ids to pages
        self._page_html: dict[int, str] = {}

    def run(self) -> None:
        try:
            if dev_mode:
                # idempotent if logging has already been set up
                logging.basicConfig()
            logging.getLogger("waitress").setLevel(logging.ERROR)

            desired_host = os.getenv("ANKI_API_HOST", "127.0.0.1")
            desired_port = int(os.getenv("ANKI_API_PORT", "0"))
            self.server = create_server(
                app,
                host=desired_host,
                port=desired_port,
                clear_untrusted_proxy_headers=True,
            )
            if dev_mode:
                print(
                    "Serving on http://%s:%s"
                    % (self.server.effective_host, self.server.effective_port)  # type: ignore
                )

            self._ready.set()
            self.server.run()

        except Exception:
            if not self.is_shutdown:
                raise

    def shutdown(self) -> None:
        self.is_shutdown = True
        sockets = list(self.server._map.values())  # type: ignore
        for socket in sockets:
            socket.handle_close()
        # https://github.com/Pylons/webtest/blob/4b8a3ebf984185ff4fefb31b4d0cf82682e1fcf7/webtest/http.py#L93-L104
        self.server.task_dispatcher.shutdown()

    def getPort(self) -> int:
        self._ready.wait()
        return int(self.server.effective_port)  # type: ignore

    def set_page_html(self, id: int, html: str) -> None:
        self._page_html[id] = html

    def get_page_html(self, id: int) -> str | None:
        return self._page_html.get(id)

    def clear_page_html(self, id: int) -> None:
        try:
            del self._page_html[id]
        except KeyError:
            pass


@app.route("/favicon.ico")
def favicon() -> Response:
    request = BundledFileRequest(os.path.join("imgs", "favicon.ico"))
    return _handle_builtin_file_request(request)


def _mime_for_path(path: str) -> str:
    "Mime type for provided path/filename."
    if path.endswith(".css"):
        # some users may have invalid mime type in the Windows registry
        return "text/css"
    elif path.endswith(".js"):
        return "application/javascript"
    else:
        # autodetect
        mime, _encoding = mimetypes.guess_type(path)
        return mime or "application/octet-stream"


def _handle_local_file_request(request: LocalFileRequest) -> Response:
    directory = request.root
    path = request.path
    try:
        isdir = os.path.isdir(os.path.join(directory, path))
    except ValueError:
        return flask.make_response(
            f"Path for '{directory} - {path}' is too long!",
            HTTPStatus.BAD_REQUEST,
        )

    directory = os.path.realpath(directory)
    path = os.path.normpath(path)
    fullpath = os.path.abspath(os.path.join(directory, path))

    # protect against directory traversal: https://security.openstack.org/guidelines/dg_using-file-paths.html
    if not fullpath.startswith(directory):
        return flask.make_response(
            f"Path for '{directory} - {path}' is a security leak!",
            HTTPStatus.FORBIDDEN,
        )

    if isdir:
        return flask.make_response(
            f"Path for '{directory} - {path}' is a directory (not supported)!",
            HTTPStatus.FORBIDDEN,
        )

    try:
        mimetype = _mime_for_path(fullpath)
        if os.path.exists(fullpath):
            if fullpath.endswith(".css"):
                # caching css files prevents flicker in the webview, but we want
                # a short cache
                max_age = 10
            elif fullpath.endswith(".js"):
                # don't cache js files
                max_age = 0
            else:
                max_age = 60 * 60
            return flask.send_file(
                fullpath, mimetype=mimetype, conditional=True, max_age=max_age, download_name="foo"  # type: ignore[call-arg]
            )
        else:
            print(f"Not found: {path}")
            return flask.make_response(
                f"Invalid path: {path}",
                HTTPStatus.NOT_FOUND,
            )

    except Exception as error:
        if dev_mode:
            print(
                "Caught HTTP server exception,\n%s"
                % "".join(traceback.format_exception(*sys.exc_info())),
            )

        # swallow it - user likely surfed away from
        # review screen before an image had finished
        # downloading
        return flask.make_response(
            str(error),
            HTTPStatus.INTERNAL_SERVER_ERROR,
        )


def _builtin_data(path: str) -> bytes:
    """Return data from file in aqt/data folder.
    Path must use forward slash separators."""
    # packaged build?
    if getattr(sys, "frozen", False):
        reader = aqt.__loader__.get_resource_reader("_aqt")  # type: ignore
        with reader.open_resource(path) as f:
            return f.read()
    else:
        full_path = aqt_data_path() / ".." / path
        return full_path.read_bytes()


def _handle_builtin_file_request(request: BundledFileRequest) -> Response:
    path = request.path
    mimetype = _mime_for_path(path)
    data_path = f"data/web/{path}"
    try:
        data = _builtin_data(data_path)
        return Response(data, mimetype=mimetype)
    except FileNotFoundError:
        if dev_mode:
            print(f"404: {data_path}")
        return flask.make_response(
            f"Invalid path: {path}",
            HTTPStatus.NOT_FOUND,
        )
    except Exception as error:
        if dev_mode:
            print(
                "Caught HTTP server exception,\n%s"
                % "".join(traceback.format_exception(*sys.exc_info())),
            )

        # swallow it - user likely surfed away from
        # review screen before an image had finished
        # downloading
        return flask.make_response(
            str(error),
            HTTPStatus.INTERNAL_SERVER_ERROR,
        )


@app.route("/<path:pathin>", methods=["GET", "POST"])
def handle_request(pathin: str) -> Response:
    request = _extract_request(pathin)
    if dev_mode:
        print(f"{time.time():.3f} {flask.request.method} /{pathin}")

    if isinstance(request, NotFound):
        print(request.message)
        return flask.make_response(
            f"Invalid path: {pathin}",
            HTTPStatus.NOT_FOUND,
        )
    elif callable(request):
        return _handle_dynamic_request(request)
    elif isinstance(request, BundledFileRequest):
        return _handle_builtin_file_request(request)
    elif isinstance(request, LocalFileRequest):
        return _handle_local_file_request(request)
    else:
        return flask.make_response(
            f"unexpected request: {pathin}",
            HTTPStatus.FORBIDDEN,
        )


def _extract_internal_request(
    path: str,
) -> BundledFileRequest | DynamicRequest | NotFound | None:
    "Catch /_anki references and rewrite them to web export folder."
    prefix = "_anki/"
    if not path.startswith(prefix):
        return None

    dirname = os.path.dirname(path)
    filename = os.path.basename(path)
    additional_prefix = None

    if dirname == "_anki":
        if flask.request.method == "POST":
            return _extract_collection_post_request(filename)
        elif get_handler := _extract_dynamic_get_request(filename):
            return get_handler

        # remap legacy top-level references
        base, ext = os.path.splitext(filename)
        if ext == ".css":
            additional_prefix = "css/"
        elif ext == ".js":
            if base in ("browsersel", "jquery-ui", "jquery", "plot"):
                additional_prefix = "js/vendor/"
            else:
                additional_prefix = "js/"
    # handle requests for vendored libraries
    elif dirname == "_anki/js/vendor":
        base, ext = os.path.splitext(filename)

        if base == "jquery":
            base = "jquery.min"
            additional_prefix = "js/vendor/"
        elif base == "jquery-ui":
            base = "jquery-ui.min"
            additional_prefix = "js/vendor/"
        elif base == "browsersel":
            base = "css_browser_selector.min"
            additional_prefix = "js/vendor/"

    if additional_prefix:
        oldpath = path
        path = f"{prefix}{additional_prefix}{base}{ext}"
        print(f"legacy {oldpath} remapped to {path}")

    return BundledFileRequest(path=path[len(prefix) :])


def _extract_addon_request(path: str) -> LocalFileRequest | NotFound | None:
    "Catch /_addons references and rewrite them to addons folder."
    prefix = "_addons/"
    if not path.startswith(prefix):
        return None

    addon_path = path[len(prefix) :]

    try:
        manager = aqt.mw.addonManager
    except AttributeError as error:
        if dev_mode:
            print(f"_redirectWebExports: {error}")
        return None

    try:
        addon, sub_path = addon_path.split("/", 1)
    except ValueError:
        return None
    if not addon:
        return None

    pattern = manager.getWebExports(addon)
    if not pattern:
        return None

    if re.fullmatch(pattern, sub_path):
        return LocalFileRequest(root=manager.addonsFolder(), path=addon_path)

    return NotFound(message=f"couldn't locate item in add-on folder {path}")


def _extract_request(
    path: str,
) -> LocalFileRequest | BundledFileRequest | DynamicRequest | NotFound:
    if internal := _extract_internal_request(path):
        return internal
    elif addon := _extract_addon_request(path):
        return addon

    if not aqt.mw.col:
        return NotFound(message=f"collection not open, ignore request for {path}")

    path = hooks.media_file_filter(path)
    return LocalFileRequest(root=aqt.mw.col.media.dir(), path=path)


def congrats_info() -> bytes:
    if not aqt.mw.col.sched._is_finished():
        aqt.mw.taskman.run_on_main(lambda: aqt.mw.moveToState("review"))
    return raw_backend_request("congrats_info")()


def get_deck_configs_for_update() -> bytes:
    return aqt.mw.col._backend.get_deck_configs_for_update_raw(request.data)


def update_deck_configs() -> bytes:
    # the regular change tracking machinery expects to be started on the main
    # thread and uses a callback on success, so we need to run this op on
    # main, and return immediately from the web request

    input = UpdateDeckConfigs()
    input.ParseFromString(request.data)

    def on_success(changes: OpChanges) -> None:
        if isinstance(window := aqt.mw.app.activeWindow(), DeckOptionsDialog):
            window.reject()

    def handle_on_main() -> None:
        update_deck_configs_op(parent=aqt.mw, input=input).success(
            on_success
        ).run_in_background()

    aqt.mw.taskman.run_on_main(handle_on_main)
    return b""


def get_scheduling_states_with_context() -> bytes:
    return SchedulingStatesWithContext(
        states=aqt.mw.reviewer.get_scheduling_states(),
        context=aqt.mw.reviewer.get_scheduling_context(),
    ).SerializeToString()


def set_scheduling_states() -> bytes:
    states = SetSchedulingStatesRequest()
    states.ParseFromString(request.data)
    aqt.mw.reviewer.set_scheduling_states(states)
    return b""


def change_notetype() -> bytes:
    data = request.data

    def handle_on_main() -> None:
        window = aqt.mw.app.activeWindow()
        if isinstance(window, ChangeNotetypeDialog):
            window.save(data)

    aqt.mw.taskman.run_on_main(handle_on_main)
    return b""


def import_csv() -> bytes:
    data = request.data

    def handle_on_main() -> None:
        window = aqt.mw.app.activeWindow()
        if isinstance(window, ImportCsvDialog):
            window.do_import(data)

    aqt.mw.taskman.run_on_main(handle_on_main)
    return b""


post_handler_list = [
    congrats_info,
    get_deck_configs_for_update,
    update_deck_configs,
    get_scheduling_states_with_context,
    set_scheduling_states,
    change_notetype,
    import_csv,
]


exposed_backend_list = [
    # DeckService
    "get_deck_names",
    # I18nService
    "i18n_resources",
    # ImportExportService
    "get_csv_metadata",
    # NotesService
    "get_field_names",
    "get_note",
    # NotetypesService
    "get_notetype_names",
    "get_change_notetype_info",
    # StatsService
    "card_stats",
    "graphs",
    "get_graph_preferences",
    "set_graph_preferences",
    # TagsService
    "complete_tag",
    # ImageOcclusionService
    "get_image_for_occlusion",
    "add_image_occlusion_note",
    "get_image_occlusion_note",
    "update_image_occlusion_note",
]


def raw_backend_request(endpoint: str) -> Callable[[], bytes]:
    # check for key at startup
    from anki._backend import RustBackend

    assert hasattr(RustBackend, f"{endpoint}_raw")

    return lambda: getattr(aqt.mw.col._backend, f"{endpoint}_raw")(request.data)


# all methods in here require a collection
post_handlers = {
    stringcase.camelcase(handler.__name__): handler for handler in post_handler_list
} | {
    stringcase.camelcase(handler): raw_backend_request(handler)
    for handler in exposed_backend_list
}


def _extract_collection_post_request(path: str) -> DynamicRequest | NotFound:
    if not aqt.mw.col:
        return NotFound(message=f"collection not open, ignore request for {path}")
    if handler := post_handlers.get(path):
        # convert bytes/None into response
        def wrapped() -> Response:
            try:
                if data := handler():
                    response = flask.make_response(data)
                    response.headers["Content-Type"] = "application/binary"
                else:
                    response = flask.make_response("", HTTPStatus.NO_CONTENT)
            except Exception as exc:
                print(traceback.format_exc())
                response = flask.make_response(
                    str(exc), HTTPStatus.INTERNAL_SERVER_ERROR
                )
            return response

        return wrapped
    else:
        return NotFound(message=f"{path} not found")


def _handle_dynamic_request(request: DynamicRequest) -> Response:
    try:
        return request()
    except Exception as e:
        return flask.make_response(str(e), HTTPStatus.INTERNAL_SERVER_ERROR)


def legacy_page_data() -> Response:
    id = int(request.args["id"])
    if html := aqt.mw.mediaServer.get_page_html(id):
        return Response(html, mimetype="text/html")
    else:
        return flask.make_response("page not found", HTTPStatus.NOT_FOUND)


# this currently only handles a single method; in the future, idempotent
# requests like i18nResources should probably be moved here
def _extract_dynamic_get_request(path: str) -> DynamicRequest | None:
    if path == "legacyPageData":
        return legacy_page_data
    else:
        return None