Counter-Strike raw data availability
March 21, 2025
I’m back with another CS-related project! (Previously: a custom replay watcher prototype, a “demo to text” visualisation tool, and the seeds of today's project.)
For years it's been in the back of my mind that it would be great to lower the barrier to entry for accessing raw Counter-Strike pro match data, specifically for people who want to do analysis and visualisation work.
I'm working on the assumption that there are a bunch of people out there who are scraping (or manually downloading) data from HLTV, extracting all the demos, figuring out how to parse them into some usable format, and then figuring out how to query that format in aggregate.
My current plan is to do all this boring legwork: parse all pro demos and save the results into a few massive database tables. That makes basic things very simple and drastically simplifies the complex ones.
As a basic example, with the ability to query all "damages" we can figure out where people receive the most damage:
SELECT map_name, victim_place, SUM(dmg_health) AS total_damage
FROM damages
GROUP BY map_name, victim_place
ORDER BY total_damage DESC;
If you're curious, the result of the above query for the Shanghai Major & ESL Pro League 21 combined is as follows:
Map | Place | Damage |
---|---|---|
de_mirage | BombsiteA | 139'628 |
de_inferno | BombsiteB | 135'064 |
de_inferno | Banana | 124'814 |
Or maybe "who fired which weapon the most?"
SELECT player_name, weapon, COUNT(*) AS shots_fired_count
FROM shots
GROUP BY player_name, weapon
ORDER BY shots_fired_count DESC;
Gets us:
Player | Weapon | Shots fired |
---|---|---|
Senzu | weapon_ak47 | 5328 |
donk | weapon_ak47 | 4963 |
NertZ | weapon_ak47 | 4150 |
yuurih | weapon_ak47 | 3897 |
NiKo | weapon_ak47 | 3877 |
apEX | weapon_ak47 | 3784 |
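If you want to get a feel for these two queries before the real dataset exists, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names are taken from the queries above; everything else (the in-memory database, the column types, the toy rows and the `player_one` / `player_two` names) is made up for illustration and says nothing about the real schema or storage engine.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the real tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE damages (map_name TEXT, victim_place TEXT, dmg_health INTEGER);
    CREATE TABLE shots (player_name TEXT, weapon TEXT);
""")

# Toy rows only: these values are invented, not real match data.
conn.executemany(
    "INSERT INTO damages VALUES (?, ?, ?)",
    [
        ("de_mirage", "BombsiteA", 27),
        ("de_mirage", "BombsiteA", 100),
        ("de_inferno", "Banana", 54),
        ("de_inferno", "BombsiteB", 12),
    ],
)
conn.executemany(
    "INSERT INTO shots VALUES (?, ?)",
    [
        ("player_one", "weapon_ak47"),
        ("player_one", "weapon_ak47"),
        ("player_one", "weapon_ak47"),
        ("player_two", "weapon_ak47"),
        ("player_two", "weapon_ak47"),
        ("player_one", "weapon_awp"),
    ],
)

# Where do people receive the most damage?
damage_by_place = conn.execute("""
    SELECT map_name, victim_place, SUM(dmg_health) AS total_damage
    FROM damages
    GROUP BY map_name, victim_place
    ORDER BY total_damage DESC
""").fetchall()

# Who fired which weapon the most?
shots_by_weapon = conn.execute("""
    SELECT player_name, weapon, COUNT(*) AS shots_fired_count
    FROM shots
    GROUP BY player_name, weapon
    ORDER BY shots_fired_count DESC
""").fetchall()

for row in damage_by_place + shots_by_weapon:
    print(row)
```

Swapping the toy rows for real parsed demo data should leave the two SELECT statements unchanged, which is the whole appeal of having everything in a few flat tables.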
These are basic examples, but you hopefully get the idea of what's possible.
If this sounds interesting to you, or you work in the analysis / visualisation space, I'd love to hear from you! I've set up a form to get feedback, and there's a link to a Discord server at the end.
🤓