featured_article_snapshot_id in snapshot_apparitions view

In the snapshot_apparitions view, the field featured_article_snapshot_id is treated as if it were unique per row, but it is not.
This can easily be checked with this query:
SELECT * FROM (
    SELECT featured_article_snapshot_id, json_group_array(snapshot_id), COUNT(*) AS count
    FROM snapshot_apparitions
    WHERE is_main -- Not required
    GROUP BY featured_article_snapshot_id
)
WHERE count > 1
Among other things, this leads to dead ends while browsing the UI, likely because the timestamp search and the time diff rely on this false assumption.
2024-05-23: This is likely no longer relevant, now that the URLs include the timestamp instead of the snapshot_id.
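
To try this check without touching the real database, the sketch below builds a toy in-memory SQLite table and runs a simplified version of the query. The table layout and the sample rows are assumptions made for illustration, not the project's actual schema, and `group_concat` stands in for `json_group_array` so the example does not depend on the JSON functions being available.

```python
import sqlite3

# Hypothetical, simplified stand-in for the snapshot_apparitions view.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE snapshot_apparitions ("
    " snapshot_id INTEGER,"
    " featured_article_snapshot_id INTEGER,"
    " is_main INTEGER)"
)
conn.executemany(
    "INSERT INTO snapshot_apparitions VALUES (?, ?, ?)",
    [
        (1, 100, 1),  # featured article 100 appears in snapshot 1 ...
        (2, 100, 1),  # ... and again in snapshot 2, so it is not unique per row
        (3, 200, 1),  # featured article 200 appears only once
    ],
)

# Same idea as the query above, written with HAVING instead of a subquery.
DUPLICATES_QUERY = """
SELECT featured_article_snapshot_id,
       group_concat(snapshot_id) AS snapshot_ids,
       COUNT(*) AS n
FROM snapshot_apparitions
WHERE is_main
GROUP BY featured_article_snapshot_id
HAVING COUNT(*) > 1
"""

duplicates = conn.execute(DUPLICATES_QUERY).fetchall()
for featured_id, snapshot_ids, n in duplicates:
    print(featured_id, snapshot_ids, n)
```

Any row returned here is a featured_article_snapshot_id shared by several snapshots, which is exactly the case that breaks the uniqueness assumption.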
The snapshot process ends up choosing the same snapshot for different virtual timestamps.
This can be checked with this query:
SELECT
    sv.id, sv.site_id, sv2.id, sv2.site_id, sv.timestamp_virtual, sv2.timestamp_virtual, sv2.timestamp
FROM snapshots_view sv
CROSS JOIN snapshots_view sv2
WHERE
    sv.id != sv2.id
    AND sv.timestamp = sv2.timestamp
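
As a rough illustration of what this self-join reports, here is a minimal sketch on a toy in-memory table. The column types and sample values are assumptions, not the real snapshots_view definition, and the join uses `sv.id < sv2.id` instead of `!=` so that each pair is listed only once.

```python
import sqlite3

# Hypothetical, simplified stand-in for snapshots_view.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE snapshots_view ("
    " id INTEGER, site_id INTEGER,"
    " timestamp INTEGER, timestamp_virtual INTEGER)"
)
conn.executemany(
    "INSERT INTO snapshots_view VALUES (?, ?, ?, ?)",
    [
        (1, 10, 1000, 900),
        (2, 10, 1000, 4500),  # same real timestamp as id 1, other virtual slot
        (3, 10, 8000, 8100),  # has its own real timestamp
    ],
)

# Pairs of distinct rows that point at the same real timestamp.
PAIRS_QUERY = """
SELECT sv.id, sv2.id, sv.timestamp_virtual, sv2.timestamp_virtual, sv.timestamp
FROM snapshots_view sv
JOIN snapshots_view sv2
  ON sv.timestamp = sv2.timestamp
 AND sv.id < sv2.id
ORDER BY sv.id, sv2.id
"""

pairs = conn.execute(PAIRS_QUERY).fetchall()
for row in pairs:
    print(row)
```

Like the original query, this sketch compares timestamps only. Rows from different sites that happen to share a timestamp would also match, so inspecting site_id in the results remains useful.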
Some snapshots are chosen even though they are up to 5 or 6 hours too early or too late.
SELECT timestamp-timestamp_virtual AS difference, * FROM snapshots_view
ORDER BY difference
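
To make the size of the drift easier to read, the sketch below converts the difference to hours and keeps only rows beyond a tolerance. It assumes that both columns hold Unix timestamps in seconds, which is not confirmed here. The one-hour tolerance, the table layout, and the sample rows are also hypothetical.

```python
import sqlite3

HOUR = 3600  # assumes timestamps are expressed in seconds

# Hypothetical, simplified stand-in for snapshots_view.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE snapshots_view ("
    " id INTEGER, timestamp INTEGER, timestamp_virtual INTEGER)"
)
conn.executemany(
    "INSERT INTO snapshots_view VALUES (?, ?, ?)",
    [
        (1, 10 * HOUR + 300, 10 * HOUR),  # 5 minutes late: acceptable
        (2, 15 * HOUR, 20 * HOUR),        # 5 hours too early
        (3, 36 * HOUR, 30 * HOUR),        # 6 hours too late
    ],
)

# Keep only snapshots whose drift exceeds the tolerance, most negative first.
OUTLIERS_QUERY = """
SELECT id, (timestamp - timestamp_virtual) / 3600.0 AS difference_hours
FROM snapshots_view
WHERE abs(timestamp - timestamp_virtual) > :tolerance
ORDER BY difference_hours
"""

outliers = conn.execute(OUTLIERS_QUERY, {"tolerance": HOUR}).fetchall()
for snapshot_id, difference_hours in outliers:
    print(snapshot_id, difference_hours)
```

Negative values mean the chosen snapshot was taken before the virtual timestamp, positive values mean it was taken after.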