chore: Replace python parquet generation script with ts #1876
ZhongpinWang wants to merge 8 commits into
Summary
The following content is AI-generated and provides a summary of the pull request:
Replace Python Parquet Generation Script with TypeScript
Chore
♻️ Replaces the existing Python script for generating Parquet files with TypeScript equivalents, aligning the tooling with the rest of the project's TypeScript-based stack.
  "typescript": "^6.0.3",
- "zod": "^4.4.3"
+ "zod": "^4.4.3",
+ "@dsnp/parquetjs": "^1.8.7"
[pp] I would prefer something lighter (dependency-tree wise), e.g. hyparquet-writer, parquet-wasm or @duckdb/duckdb-wasm.
I picked this one only because it is quite popular. Also, different Parquet libraries implement the format differently. For now I can't test whether the generated Parquet file works with our services (they might expect a Parquet file exported from HANA). We can hold this PR for a moment and see whether the package works in the end.
[nth] Consider enabling compression. This file is quite a bit larger despite similar contents.
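One way to act on the compression suggestion, as a rough sketch only: parquetjs-style schemas are understood to accept a per-field `compression` option, so a small helper could apply one codec to every field before the schema is built. The `FieldDef` shape, the `withCompression` helper, and the sample field names are hypothetical and not taken from this PR; the supported codec names should be checked against the installed `@dsnp/parquetjs` version.

```typescript
// Hypothetical field definition shape, modelled on parquetjs-style schemas.
type FieldDef = { type: string; optional?: boolean; compression?: string };

// Codec names assumed to be accepted by the library; verify before use.
type Codec = "UNCOMPRESSED" | "GZIP" | "SNAPPY";

// Returns a copy of the schema definition with `compression` set on every field.
function withCompression(
  fields: Record<string, FieldDef>,
  codec: Codec,
): Record<string, FieldDef> {
  const result: Record<string, FieldDef> = {};
  for (const [fieldName, def] of Object.entries(fields)) {
    result[fieldName] = { ...def, compression: codec };
  }
  return result;
}

// Sample fields are illustrative only.
const compressedFields = withCompression(
  { id: { type: "UTF8" }, score: { type: "DOUBLE", optional: true } },
  "SNAPPY",
);
// The result would then be handed to the library's schema constructor.
```

Keeping the codec in one place makes it easy to compare file sizes between codecs once the output can be tested against the target service.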
Update:
Since it is currently not possible to test whether the generated Parquet file works with the context registry service, I will pause this PR for now.
As in the title.
Also added a flag to the generation script to leave out items with a [PREDICT] placeholder. This is a preparation step for the upcoming context registry feature support.
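A minimal sketch of how such a flag might filter items, under stated assumptions: the `Item` shape, the `--exclude-predict` flag name, and both helper functions are hypothetical, since the actual script and its data model are not shown here.

```typescript
// Hypothetical item shape; the real script's data model is not shown in this PR.
interface Item {
  id: string;
  text: string;
}

const PREDICT_PLACEHOLDER = "[PREDICT]";

// Hypothetical flag name; takes the argument list explicitly so it is easy to test.
function parseExcludePredict(argv: string[]): boolean {
  return argv.includes("--exclude-predict");
}

// Drops every item whose text contains the placeholder when the flag is set.
function selectItems(items: Item[], excludePredict: boolean): Item[] {
  if (!excludePredict) {
    return items;
  }
  return items.filter((item) => !item.text.includes(PREDICT_PLACEHOLDER));
}
```

In a Node.js script, `parseExcludePredict(process.argv.slice(2))` would supply the flag value before the rows are written.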