OpenStreetMap (OSM) is a global project formed around a geographic information database which is being filled by all comers — both enthusiasts and interested companies. Anybody can contribute, but the openness has its downside: incorrect edits often get into the database. Hence plenty of validators of OSM data have been written which allow to maintain the data quality at an acceptable level.
Since 2016 there exists an open source subway preprocessor that validates (generates error reports) rapid transit routes in OSM for completeness and logical/topological errors, and converts them into formats that are suitable for routing and rendering, e.g. GTFS. Besides OSM data it takes a list of public transport (PT) networks which contains the checking information about the number of lines, stations etc. per a PT network. The preprocessor has successfully proven itself in the preparation of PT data for applications such as Maps.me and Organic Maps.
In this article, I would like to share an approach to detecting one of the types of errors that occur quite often in OSM data and automatic detection of which is somewhat challenging. It's an accidental loss of a station from a route. The source code of the validator and the described algorithm are open source. But first, let's define the concepts used to represent PT data in OpenStreetMap.