16 Commits

Author SHA1 Message Date
sam cd81b7514a Tracked mca_querying.py. 2026-05-26 17:33:25 +01:00
sam c7bb1608a1 Progress towards sensible-enough numpy based querying using shared memory views rather than duplicated data. 2026-05-26 17:32:56 +01:00
sam b9ffbdee89 Created mca_stubs.py which provides record entry details (and slice objects) for the .MCA types. 2026-05-26 11:06:28 +01:00
sam e887cc791e Removal of sqlite/sqlalchemy based approach - it is too slow to combine the results, even with in-memory database loading. 2026-05-26 10:17:28 +01:00
sam 0479f1e4a8 Improved a few things, querying for multiple services now runs at a tolerable speed. Would prefer if it could be improved further, will look at pre-merging tables using sql rather than pandas. 2026-05-25 21:21:53 +01:00
sam e723109a0a Began creating some utility functions and noticed some limitations. Fetching one schedule at a time is too slow, and we could easily split an aggregated result. 2026-05-25 17:46:13 +01:00
sam 36aa23f464 Various minor updates, basic Schedule class. Added a SixData class to manage conversions of YYMMDD to/from more pythonic objects. 2026-05-25 14:06:35 +01:00
sam c2633952d3 Added mca_queries.py and its pre-generated result mca_record_types.py. The latter is for type hinting and will make writing queries to solve for schedule numbers much easier. Next will be to write tools to make hunting for desired schedules easier. 2026-05-25 13:26:11 +01:00
sam 51c4f5030c Updated the raw_mca_... table generation to include line number from the file, and schedule number - although we may need to investigate how the last entry behaves with 'ZZ' records and any others. We don't want to inherit the technical debt of remembering this one case every time. 2026-05-23 10:38:03 +01:00
sam f35cda6f10 Finished parsing.py initial implementation, now have a sqlite database generating >600MB of timetable records. Next will be generating sqlalchemy descriptors based on the automated specifications. If I can re-learn sqlalchemy that is. 2026-05-22 16:57:14 +01:00
sam d63f151c9b Added sqlite export of .MCA file's record spec. This won't live in /data, but in a user's cache. This is to allow user choice on how and when to update the timetable files and reduce redundancy. 2026-05-22 11:59:45 +01:00
sam fc09eb775e Parsing of RSP's specification now cached to /data. This means that we can ship the tables with the project, rather than the .pdf being a requirement of use. 2026-05-22 10:37:42 +01:00
sam 14b17a22d7 Used pypdf to create extract_specification_document_tables in parsing.py. Should allow easier indexing of the various file types in future. Will need to adapt for files other than .MCA and look at formalising into a local database. 2026-05-22 01:11:24 +01:00
sam 51c6af9782 Tracked nr_requests.py and added fetch_nr_timetable_files. 2026-05-21 20:23:43 +01:00
sam f454af8ab4 Added NRConfig and fetch_nr_token in nr_requests.py. 2026-05-21 18:52:24 +01:00
sam 0ce0fe8610 Init. 2026-05-21 18:00:03 +01:00