BadB
Professional
Mobile operators collect vast amounts of data and metadata that can reveal a great deal about an individual subscriber's life. Once you understand how this data is processed and stored, you can trace the entire chain of information from a call to the moment money is debited. In the insider threat model the possibilities are greater still, because data protection is not among the tasks of pre-billing systems at all.
First, keep in mind that subscriber traffic in a telecom operator's network is generated and received by many different kinds of equipment. This equipment produces record files (CDR files, RADIUS logs, plain ASCII text) and speaks various protocols (NetFlow, SNMP, SOAP). The whole unruly zoo has to be managed: the data must be captured, processed, and passed on to the billing system in a pre-agreed, standardized format.
At the same time, subscriber data flows through every part of this chain, and outsiders should not be able to reach it. So how secure is the information in such a system, taking the whole chain into account? Let's figure it out.
Why do telecom operators need prebilling?
Subscribers are assumed to want ever newer and more modern types of services, but the operator cannot replace its equipment every time. Implementing new services and new ways of delivering them is therefore the first task of pre-billing. The second is traffic analysis: checking that it is correct, that it is uploaded to subscriber billing in full, and preparing the data for rating.
Various data reconciliation and reloading operations are also implemented in prebilling, for example checking the status of services on the hardware and in billing. Sometimes a subscriber keeps using services even though they are already blocked in billing; or the services were used, but no records about it arrived from the equipment. There can be many such situations, and most of them are resolved with pre-billing.
You can endlessly tune and improve processing, but at some point circumstances and data will always line up so that an exception occurs. You can build an excellent system for operating and monitoring billing and pre-billing, but you cannot guarantee uninterrupted operation of the equipment and the transmission channels.
This is why there is a backup system that cross-checks the data in billing against the data that went from pre-billing to billing. Its task is to catch anything that left the equipment but for some reason "did not land on the subscriber". This role is usually played by the FMS (Fraud Management System), which duplicates and controls prebilling. Of course, its main purpose is not to control pre-billing at all but to detect fraud schemes; monitoring data losses and discrepancies between equipment data and billing data comes as a side effect.
In fact, there are many ways to use pre-billing. For example, it may perform a reconciliation between the subscriber's state on the hardware and in CRM. Such a scheme may look like this:
- Using SOAP, prebilling fetches data from the hardware (HSS, VLR, HLR, AuC, EIR).
- Converting the raw source data to the desired format.
- We make a request to related CRM systems (databases, software interfaces).
- We perform data reconciliation.
- Creating exception records.
- Making a request to the CRM system for data synchronization.
- Bottom line: a subscriber downloading a movie while roaming in South Africa is blocked at zero balance and does not run up a wild negative one.
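The reconciliation step above can be sketched roughly as follows. This is a minimal illustration, not the real IUM logic: the state names, the record maps, and the `reconcile` helper are all invented for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Minimal sketch of a hardware-vs-CRM reconciliation pass.
// The "ACTIVE"/"BLOCKED" states and record shapes are illustrative only.
public class Reconciliation {

    // Compare subscriber states reported by the hardware (HSS/HLR)
    // with the states stored in CRM, and collect the mismatches.
    static Map<String, String> reconcile(Map<String, String> hardware,
                                         Map<String, String> crm) {
        Map<String, String> exceptions = new HashMap<>();
        for (Map.Entry<String, String> e : hardware.entrySet()) {
            String crmState = crm.get(e.getKey());
            if (!e.getValue().equals(crmState)) {
                // Exception record: subscriber -> "hardwareState/crmState"
                exceptions.put(e.getKey(), e.getValue() + "/" + crmState);
            }
        }
        return exceptions;
    }

    public static void main(String[] args) {
        Map<String, String> hw = new HashMap<>();
        hw.put("79990000001", "ACTIVE");   // still active on the switch...
        hw.put("79990000002", "ACTIVE");
        Map<String, String> crm = new HashMap<>();
        crm.put("79990000001", "BLOCKED"); // ...but blocked in CRM: needs sync
        crm.put("79990000002", "ACTIVE");
        // One exception record for 79990000001; it would then drive a
        // synchronization request to CRM or to the hardware.
        System.out.println(reconcile(hw, crm));
    }
}
```

The exception records produced here would feed the synchronization request in the last step of the list.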
Another use case is data accumulation and further processing. This is needed when thousands of records arrive from the equipment (GGSN/SGSN, telephony): dumping every one of them into the subscriber's itemization is utter madness, not to mention that such a flood of tiny records loads every system in the chain. The following scheme solves the problem.
- Getting data from hardware.
- Data aggregation on prebilling (we wait until all the records required by a certain condition have been collected).
- Sending data to the final billing service.
- Bottom line: instead of 10,000 records we sent one, carrying the aggregated value of the Internet traffic counter. We made a single database request and saved a lot of resources, including electricity!
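The aggregation idea can be sketched like this. The record format `subscriberId,bytes` is an assumption for the example; real CDR layouts are far richer.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch: collapse many small GGSN/SGSN usage records into one
// aggregated traffic counter per subscriber before handoff to billing.
public class TrafficAggregator {

    // Input records are "subscriberId,bytes" strings (an invented format).
    static Map<String, Long> aggregate(List<String> rawRecords) {
        Map<String, Long> totals = new HashMap<>();
        for (String rec : rawRecords) {
            String[] parts = rec.split(",");
            String subscriber = parts[0];
            long bytes = Long.parseLong(parts[1]);
            // merge: one output record per subscriber instead of thousands
            totals.merge(subscriber, bytes, Long::sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        List<String> raw = List.of(
                "79990000001,1500",
                "79990000001,2500",
                "79990000002,100");
        // Two output records instead of three inputs; in production the
        // ratio is more like one output per thousands of inputs.
        System.out.println(aggregate(raw));
    }
}
```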
How to pacify the zoo?
To make it clearer how this works and where problems can arise, let's take the Hewlett-Packard Internet Usage Manager prebilling system (HP IUM; eIUM in its updated version) and use it as an example of how such software works.
Imagine a large meat grinder into which you throw meat, vegetables, loaves of bread, whatever you have. The inputs are varied, but at the output everything takes the same shape. We can swap the die and get a different shape at the output, but the principle and the processing path stay the same: auger, knife, die. This is the classic pre-billing scheme: data collection, processing, and output. In IUM prebilling these links of the chain are called the encapsulator, the aggregator, and the datastore.
It is important to understand that the input must be complete: there is a certain minimum amount of information without which further processing is useless. If a required block or data element is missing, we get an error or a warning that processing is impossible, because the operations cannot be performed without that data.
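A completeness check of this kind can be sketched as follows; the required field names are invented for the example, and a real collector would park or reject the whole file rather than just print a message.

```java
import java.util.List;
import java.util.Map;

// Sketch of an input-completeness check: without a minimum set of fields
// a record cannot be rated, so processing must stop with an error.
public class CompletenessCheck {

    // Hypothetical minimum field set for a voice record.
    static final List<String> REQUIRED =
            List.of("subscriberId", "startTime", "duration");

    // Returns the names of required fields missing from the record.
    static List<String> missingFields(Map<String, String> record) {
        return REQUIRED.stream()
                .filter(f -> record.get(f) == null || record.get(f).isEmpty())
                .toList();
    }

    public static void main(String[] args) {
        Map<String, String> cdr = Map.of(
                "subscriberId", "79990000001",
                "startTime", "2023-01-01T00:00:00"); // "duration" is missing
        List<String> missing = missingFields(cdr);
        if (!missing.isEmpty()) {
            // A real system would raise an exception or quarantine the file.
            System.out.println("cannot rate record, missing: " + missing);
        }
    }
}
```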
That is why it is so important that the equipment generates record files with the strictly defined field set and data types specified by the manufacturer. Each type of equipment has its own processor (collector) that works only with its own input format. For example, you cannot simply feed a file of mobile Internet traffic from Cisco PGW-SGW equipment to a collector that processes the stream from Iskratel Si3000 fixed-line equipment.
If we do, then at best we get an exception during processing, and at worst the processing of that particular stream stops entirely: the collector handler fails with an error and waits until we deal with the file that is "broken" from its point of view. As a rule, all pre-billing systems react badly to data that a specific handler-collector was not configured to process.
The parsed data stream (RAW) is initially formed at the encapsulator level and can already be transformed and filtered there. This is done when changes need to be applied to the flow before aggregation, so that they then affect the entire data stream as it passes through the various aggregation schemes.
Files (.cdr, .log, and so on) with records of subscriber activity arrive both from local sources and from remote ones (FTP, SFTP); other protocols are possible as well. The parser processes the files using different Java classes.
Since a prebilling system in normal operation is not designed to keep a history of processed files (and there may be hundreds of thousands of them per day), a file is deleted from the source after processing. For various reasons the deletion does not always succeed, so records from a file sometimes get processed repeatedly, or with a long delay (once the file could finally be deleted). To prevent such duplicates there are protection mechanisms: checks for duplicate files or records, checks on the timestamps inside records, and so on.
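The core of such a duplicate-protection mechanism is simply remembering what has already been processed. A minimal sketch, with the caveat that real systems bound this history, key on content hashes rather than names, and also check record timestamps:

```java
import java.util.HashSet;
import java.util.Set;

// Sketch of duplicate protection: remember an identifier for every file
// already processed and refuse to process repeats.
public class DuplicateGuard {

    private final Set<String> processed = new HashSet<>();

    // Returns true if the file should be processed, false if it is a repeat.
    // Set.add() returns false when the element was already present.
    boolean accept(String fileId) {
        return processed.add(fileId);
    }

    public static void main(String[] args) {
        DuplicateGuard guard = new DuplicateGuard();
        System.out.println(guard.accept("cdr_0001.cdr")); // first time: process
        System.out.println(guard.accept("cdr_0001.cdr")); // repeat: skip
    }
}
```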
One of the most vulnerable points here is sensitivity to data volume. The more data we store (in memory, in databases), the slower we process new data and the more resources we consume, and sooner or later we hit the limit beyond which old data has to be deleted. Auxiliary databases (MySQL, TimesTen, Oracle, and so on) are therefore usually used to store this metadata. That gives us yet another system that affects the operation of pre-billing, with all the security issues that entails.
What's in the black box?
Once upon a time, at the dawn of such systems, they were built on languages that handled regular expressions well, such as Perl. Setting aside integration with external systems, almost all of prebilling boils down to rules for parsing and converting strings, and for that nothing beats regular expressions. But ever-growing data volumes and growing pressure to launch new services quickly made such systems untenable: testing and changes took too long, and scalability was poor.
Modern prebilling is a set of modules, usually written in Java, managed through a graphical interface with the standard copy, paste, move, and drag operations. Working in this interface is simple and straightforward.
Such systems mostly run on Linux or Unix based operating systems, less often on Windows.
The main problems usually arise in testing or error hunting, because the data passes through many chains of rules and is enriched with data from other systems. Seeing what happens to it at each stage is not always convenient, so you end up hunting for the cause by tracing changes to the relevant variables in the logs.
The weakness of such a system is its complexity and the human factor: any unhandled exception leads to data loss or to incorrect data being generated.
Data is processed sequentially. If an exception at the input prevents us from correctly accepting and processing data, either the entire input stream stalls or the portion of bad data is discarded. The parsed RAW stream then goes to the next stage, aggregation. There can be several aggregation schemes, isolated from one another, like a single stream of water entering a shower head and splitting through the grate into separate jets, some thick, some very thin.
After aggregation the data is ready for delivery to the consumer. It can be delivered straight into databases, written to a file and sent onward, or simply written to the pre-billing storage, where it will sit until the storage is emptied.
After processing at the first level, data can be passed to a second level and beyond. Such a ladder is needed to increase processing speed and distribute the load. At the second level another stream can be added to ours, and the data can be mixed, split, copied, combined, and so on. The final stage is always delivery of the data to the systems that consume it.
Pre-billing's tasks do not include (and rightly so!):
- monitoring whether input and output data was actually received and delivered; separate systems should handle this;
- encrypting data at any stage.
Privacy policy
Here things fall apart completely. Let's start with the fact that data protection is not part of prebilling's job at all. Access to prebilling can and should be restricted at different levels (management interface, operating system), but if we forced it to encrypt data, the complexity and processing time would grow so much that it would become unusable for billing.
Often the time from the use of a service to its appearance in billing must not exceed a few minutes. The metadata needed to process a specific piece of data is usually stored in a database (MySQL, Oracle, Solid). Input and output data almost always sit in the directory of a specific collector stream, so anyone with sufficient rights (the root user, for example) can access them.
The prebilling configuration itself, with its rule sets, database credentials, FTP accounts, and so on, is stored encrypted in a file database. Without a username and password for prebilling, extracting the configuration is not so easy.
Any changes to the processing logic (rules) are recorded in the prebilling configuration log file: who changed what, and when.
Even when data is passed directly between collector handlers inside prebilling (without being exported to a file), it is still temporarily stored as a file in the handler's directory, and it can be accessed if you want to.
Data processed in prebilling is depersonalized: it contains no full names, addresses, or passport data. So even with access to this information you cannot learn a subscriber's personal data from it, though you can still pick out information by a specific phone number, IP address, or other identifier.
With access to the prebilling configuration, you obtain credentials for all the adjacent systems it works with. As a rule, access to them is allowed only from the server where prebilling runs, but not always.
If you reach the directories where handlers' file data is stored, you can modify the files waiting to be sent to consumers; these are often plain text documents. The picture then looks like this: prebilling received and processed the data, but it never arrived in the final system; it vanished into a "black hole".
Finding the cause of such losses will be hard, since only part of the data is lost, and the loss cannot be reproduced when you try to investigate it. You can inspect the input and output data, but you cannot see where it went. All that remains for the attacker is to cover his tracks in the operating system.