add GHA routing regression pipeline skeleton#14
Conversation
| -Ccmake.define.ENABLE_TOOLS=ON \ | ||
| -Ccmake.define.ENABLE_SERVICES=OFF \ | ||
| -Ccmake.define.ENABLE_CCACHE=ON |
There was a problem hiding this comment.
this is the kind of thing I expected you to get from the old PR. it's not the most relevant thing, but this is at least the 3rd time that I need to flag it in a review!
There was a problem hiding this comment.
Removed ENABLE_TOOLS=ON (default ON, not needed explicitly) and added ENABLE_GEOTIFF=OFF and ENABLE_LZ4=OFF as suggested. Kept only flags that differ from CMakeLists.txt defaults plus ENABLE_CCACHE=ON explicitly since it's important for our caching.
| - name: Bundle tiles into a single extract | ||
| id: extract | ||
| shell: bash | ||
| run: | | ||
| valhalla_build_extract -c /tmp/valhalla-build.json -v | ||
| tile_count=$(find "${{ inputs.tile_dir }}" -name "*.gph" | wc -l) | ||
| echo "built $tile_count tile(s)" | ||
| test -f "${{ inputs.tile_dir }}/tiles.tar" | ||
| echo "tile_extract=$(pwd)/${{ inputs.tile_dir }}/tiles.tar" >> "$GITHUB_OUTPUT" |
There was a problem hiding this comment.
it's fairly meaningless to have this as a separate build step. nothing is different here. I'd prefer to push this bash code into e.g. ./.github/workflows/scripts (make sure scripts isn't somehow reserved!). then we can run shell-check on it and proper syntax hightlighting in IDEs etc.
There was a problem hiding this comment.
Moved the bundle logic into .github/workflows/scripts/build_extract.sh and merged both steps into one. Verified with shellcheck -clean.
nilsnolde
left a comment
There was a problem hiding this comment.
there's one basic architectural problem: you need to run the requests on a single graph, so you don't muddy its value for QA. currently we have no idea if old/new graph build would be problematic, or old/new service is, everything's mixed together.
plus a few more problems (some small yet very old, fixed lots of times before), some new ones, plus a few questions
| git config user.email "github-actions[bot]@users.noreply.github.com" | ||
| git add diffs/${{ github.run_id }}.json | ||
| git commit -m "regression: ${{ inputs.valhalla_ref_old }} vs ${{ inputs.valhalla_ref_new }} (run ${{ github.run_id }})" | ||
| git push https://x-access-token:${{ secrets.RAD_DATA_TOKEN }}@github.com/valhalla/RAD-data.git |
There was a problem hiding this comment.
is there a way to get rid of the input token? since we control both repos? some permissions one can set maybe?
There was a problem hiding this comment.
Not with just workflow permissions, those are scoped to the repo the workflow runs in. Options I see: a GitHub App with org-level install, or keeping the PAT.
Ano. option that avoids a PAT entirely: flip the push direction. RAD-data exposes a workflow_call workflow that RAD triggers, RAD-data then writes to itself using its own GITHUB_TOKEN. Needs RAD-data's Actions settings set to "Accessible from repositories in the organization" and a reusable workflow added there. Alternatively, keeping the PAT is simpler for now. What do you think?
| - name: Build tiles for ${{ matrix.label }} ref | ||
| id: tiles | ||
| uses: ./.github/actions/build-tiles | ||
| with: | ||
| pbf_path: data/liechtenstein_graph.osm.pbf | ||
|
|
There was a problem hiding this comment.
there's a mistake hidden here: currently we'd build graphs for both SHAs. but we said we'd only want a single graph (e.g. master) right now. otherwise we can't tell what the problem is (graph build or service runtime?). that's the whole point of the "scenarios" in the initial issue, the base for this whole project.
There was a problem hiding this comment.
so for now: 1 single graph. 2 different service runtimes (the pyvalhalla stuff below). ideally in a way so that it'll be easier to add future scenarios, but I think that'll need a bigger re-write than necessary and we'll park that for later.
feel free to just use artifacts for the master graph for now. we'll revisit later.
| from valhalla.config import _sanitize_config, default_config | ||
|
|
||
|
|
||
| def main(): |
There was a problem hiding this comment.
I didn't communicate that so far and it's not necessarily obvious I'm sure, maybe it's even personal preference and not "officially best-practice": IMO in OSS, if we write code in a PR, then that code should be defensible. whatever lands in master should not ever mystically crash when the input is a bit different from what you provided in your "does it work" session. if there are TODOs left which are not taken care of yet (add_argument should have proper types, here: Path, not str; or error handling at large!), then they need to be called out, even when we're still in full dev mode and nothing can be expected to work. it's still our responsibility to make that clear if we decide to share our project IMO.
feel free to add a folder debug and dump some scripts there that'll help you in the future. you could even reference it in the current GHA yml and leave a fat TODO there, that this is WIP.
remember, you're in public here, and on a pretty big project where it's reasonable to assume that some people track this progress. they shouldn't be misled that master is in a working state when looking at the code.
There was a problem hiding this comment.
in the end: I'm not really reviewing any code in these files right now, they're far from finished! again, it's fine, just not this exact way.
There was a problem hiding this comment.
Moved both scripts to debug/ with TODO markers in the workflow noting they're WIP.
| p.add_argument("--output", required=True) | ||
| args = p.parse_args() | ||
|
|
||
| config = _sanitize_config(default_config.copy()) |
There was a problem hiding this comment.
you need to call these things out in the PR description at the very least! you're using internal API here. if this was C++, not python, this would not compile. this line here is clearly a bug you caught in pyvalhalla, that needs to go to https://github.com/valhalla/valhalla/issues, and be fixed before we use it.
There was a problem hiding this comment.
argh. this (and your slack message) sounded weird to me, so I had a look at the source of the bug: it's valhalla/valhalla#5976. I totally forgot to check the python bindings! (maybe that even fixes the reason of valhalla/valhalla#6145?)
I'll take care of this now. you don't need to change anything, other than taking one of the options you have not commit those scripts like this to master.
There was a problem hiding this comment.
On the _sanitize_config workaround - noted, and thanks for tracking down the upstream issue. Will update once the fix lands.
| } | ||
|
|
||
|
|
||
| def main(): |
There was a problem hiding this comment.
same here. this is very much WIP.
There was a problem hiding this comment.
looking at its current state, it's too busy with words. but it's good to have it started, we'll need to change things around anyways in the future.
There was a problem hiding this comment.
Noted, will trim it down in a follow-up once the architecture settles.
Implements the routing regression pipeline skeleton discussed in the architecture doc.
Architecture preview
Closes #10