Counter Strike raw data availability

March 21, 2025

I’m back with another CS-related project! (Previously: a custom replay watcher prototype, a “demo to text” visualisation tool, the seeds of today's project.)

It's been in the back of my mind for years that it would be great to lower the barrier to entry for getting access to raw Counter Strike pro match data, specifically for people who want to do analysis and visualisation work.

I'm working on the assumption that there are a bunch of people out there who are scraping (or manually downloading) data from HLTV, extracting all the demos, figuring out how to parse them into some usable format, and then figuring out how to query that format in aggregate.

My current plan is to do all this boring legwork: parse all pro demos and save the results into a few massive database tables. That makes basic things very simple and drastically simplifies the complex things.

As a basic example, with the ability to query all "damages" we can figure out where people receive the most damage:

SELECT map_name, victim_place, SUM(dmg_health) AS total_damage
FROM damages
GROUP BY map_name, victim_place
ORDER BY total_damage DESC;

If you're curious, the result of the above query for the Shanghai Major & ESL Pro League 21 combined is as follows:

Map         Place      Damage
de_mirage   BombsiteA  139'628
de_inferno  BombsiteB  135'064
de_inferno  Banana     124'814
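
To try the damage query without the full dataset, here is a minimal sketch in Python using an in-memory SQLite database. Only the table name and the columns `map_name`, `victim_place` and `dmg_health` come from the query above. The cut-down table layout, the helper name `damage_by_place` and the sample rows are made up for illustration and are not real match data.

```python
import sqlite3

# Hypothetical, cut-down version of the "damages" table: only the three
# columns the query uses. The real table would presumably have many more.
SCHEMA = """
CREATE TABLE damages (
    map_name     TEXT,
    victim_place TEXT,
    dmg_health   INTEGER
)
"""

# Made-up sample rows, NOT real match data.
SAMPLE_ROWS = [
    ("de_mirage", "BombsiteA", 27),
    ("de_mirage", "BombsiteA", 100),
    ("de_mirage", "Palace", 45),
    ("de_inferno", "Banana", 80),
    ("de_inferno", "Banana", 19),
    ("de_inferno", "BombsiteB", 60),
]

# Same aggregate query as in the post.
QUERY = """
SELECT map_name, victim_place, SUM(dmg_health) AS total_damage
FROM damages
GROUP BY map_name, victim_place
ORDER BY total_damage DESC
"""


def damage_by_place(rows):
    """Load rows into an in-memory database and run the aggregate query."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(SCHEMA)
        conn.executemany("INSERT INTO damages VALUES (?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for map_name, place, total in damage_by_place(SAMPLE_ROWS):
        print(f"{map_name:<12}{place:<12}{total}")
```

Swapping the in-memory database for a real connection would leave the query itself unchanged, which is the point of doing the parsing once up front.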

Or maybe "who fired which weapon the most?"

SELECT player_name, weapon, COUNT(*) AS shots_fired_count
FROM shots
GROUP BY player_name, weapon
ORDER BY shots_fired_count DESC;

That gets us:

Senzu   weapon_ak47     5328
donk    weapon_ak47     4963
NertZ   weapon_ak47     4150
yuurih  weapon_ak47     3897
NiKo    weapon_ak47     3877
apEX    weapon_ak47     3784
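
Narrowing the question is a small change once everything sits in one table. The sketch below, again Python with an in-memory SQLite database, runs the same shots query with an optional filter on a single weapon. The two-column table layout, the helper name `shots_per_player`, the placeholder player names and the sample rows are all assumptions for illustration, not real match data.

```python
import sqlite3

# Made-up sample rows (player_name, weapon), NOT real match data.
SAMPLE_SHOTS = [
    ("player_a", "weapon_ak47"),
    ("player_a", "weapon_ak47"),
    ("player_a", "weapon_ak47"),
    ("player_a", "weapon_awp"),
    ("player_b", "weapon_ak47"),
    ("player_b", "weapon_awp"),
    ("player_b", "weapon_awp"),
]


def shots_per_player(rows, weapon=None):
    """Count shots per (player, weapon), optionally for a single weapon."""
    conn = sqlite3.connect(":memory:")
    try:
        # Hypothetical minimal "shots" table with just the two columns used.
        conn.execute("CREATE TABLE shots (player_name TEXT, weapon TEXT)")
        conn.executemany("INSERT INTO shots VALUES (?, ?)", rows)
        # The filter is a bound parameter; NULL (None) means "all weapons".
        # Extra ORDER BY keys make the order of ties deterministic.
        return conn.execute(
            """
            SELECT player_name, weapon, COUNT(*) AS shots_fired_count
            FROM shots
            WHERE :weapon IS NULL OR weapon = :weapon
            GROUP BY player_name, weapon
            ORDER BY shots_fired_count DESC, player_name, weapon
            """,
            {"weapon": weapon},
        ).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in shots_per_player(SAMPLE_SHOTS, weapon="weapon_ak47"):
        print(*row, sep="\t")
```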

These are basic examples, but hopefully you get the idea of what's possible.

If this sounds interesting to you, or you work in the analysis / visualisation space, I'd love to hear from you! I've set up a form to get feedback, and there's a link to a Discord server at the end.

🤓