RTB4FREE Details (Layer 3 - Data Management Platform)

Last updated: April 18, 2017

Overview

This is the Data Management Platform (DMP), the third layer of the DSP. Layer 1 is the Bidder, Layer 2 is the Campaign Manager and Layer 3 is the Data Management Platform. Features at a glance:

  1. Integrated with the open source Elasticsearch, Logstash, Kibana (ELK) stack.
  2. Search and view RTB transaction log details for requests, bids, wins, pixels, clicks and costs.
  3. Search and view RTB application log details for bidder and crosstalk.
  4. Out-of-the-box dashboard reports in Kibana.
  5. Standard campaign metrics viewable in the Campaign Manager.
  6. Create custom reports using Kibana.
  7. Open architecture allows integration with your own data analytics infrastructure.
  8. Standard data backup and maintenance processes included.
  9. Scale up horizontally by deploying clusters for ELK and Kafka.
  10. Commercial support and customization are available from this link.

Because this is a Docker deployment, you must have a working knowledge of Docker. You need both Docker and docker-compose installed. For information on Docker, look here.

System Configuration

All bidder logs are sent to Kafka, as shown in the following diagram.



The system configuration depends on many parameters, including

  1. Bidder transaction rate (i.e., queries per second).
  2. Performance of infrastructure.
  3. Available disk space.
  4. Historical data retention required.
  5. Granularity of data retained.

The RTB4FREE architecture can scale to support high data processing rates by horizontally scaling any of the following components.

  1. Bidder servers can be added as required to support additional SSP rates.
  2. Kafka, which is used for transport of data between bidders and data stores, can be clustered.
  3. Kafka consumers (such as Logstash) can be partitioned to share data transfer responsibilities.
  4. Elasticsearch can be clustered to accommodate increased data rates or storage requirements.

A detailed system configuration can be determined once the requirements listed above are understood in detail.
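
As a rough illustration of how the parameters above interact, the following Python sketch estimates disk usage for a single record type. It is a back-of-envelope calculation, not an RTB4FREE tool: the function name, the average record size, and the `keep_fraction` parameter (standing in for data granularity) are all assumptions made for this example, and real index sizes depend on mappings, compression, and overhead that are not modeled here.

```python
# Back-of-envelope storage estimate. Every input is a placeholder that
# you would replace with measurements from your own deployment.
SECONDS_PER_DAY = 86_400


def estimate_storage_gb(qps, avg_record_bytes, retention_days,
                        replicas=1, keep_fraction=1.0):
    """Approximate storage in GB needed for one record type.

    qps              -- average records per second (e.g. bid requests)
    avg_record_bytes -- assumed average size of one stored record
    retention_days   -- how long records are kept before deletion
    replicas         -- extra copies of each record (0 = primary only)
    keep_fraction    -- share of records retained (1.0 = keep everything)
    """
    records = qps * SECONDS_PER_DAY * retention_days * keep_fraction
    total_bytes = records * avg_record_bytes * (1 + replicas)
    return total_bytes / 1e9


if __name__ == "__main__":
    # Hypothetical load: 1000 requests/s, 1500 bytes each, kept 30 days.
    print(round(estimate_storage_gb(1000, 1500, 30), 1), "GB")
```

Under these hypothetical inputs the estimate is about 7.8 TB, which shows why retention period and granularity are worth deciding before sizing the cluster.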

Demo With PWD

Try the stand-alone DMP in the Docker Playground below:

Try in PWD

Once the containers start, you can access Kibana at the container's IP address on port 5601. The demo Elasticsearch database has been preloaded with data so you can see how RTB logs appear in Kibana visualizations.

Kibana Visualizations

The initial Kibana view is Discover.

The RTB log records can be searched by selecting the corresponding index from the dropdown list, then entering a Lucene query in the search input box. The following record types can be selected.

  1. Requests. Requests for bids received from SSP exchanges.
  2. Bids. Bid responses to requests. The bidder will respond if a request matches a campaign.
  3. Wins. Winning bids from the exchange.
  4. Pixels. Impressions that have been served, typically signaled from a pixel image.
  5. Clicks. Click events.
  6. Postback Events. Additional user defined events, such as video events, mobile application loads, etc.
  7. Reasons. If a campaign has not bid on a request, the reason is logged here. This helps tune campaign parameters to increase win rates.
  8. RTB Logs. Bidder application log, used to determine health status of bidders.
  9. Stats. Bidder usage metrics, used to monitor bidder performance.
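
The same Lucene query syntax can also be sent to Elasticsearch directly, outside of Kibana. The sketch below wraps a Lucene string in Elasticsearch's `query_string` query and posts it to the `_search` endpoint using only the Python standard library. The index name `wins`, the field `exchange`, and the host address in the usage note are assumptions for illustration; check the index dropdown in Kibana for the names used in your deployment.

```python
import json
import urllib.request


def build_search_body(lucene_query, size=10):
    """Wrap a Lucene query string in an Elasticsearch query_string query."""
    return {"size": size, "query": {"query_string": {"query": lucene_query}}}


def search(es_url, index, lucene_query, size=10):
    """POST the query to <es_url>/<index>/_search and return the hit list."""
    body = json.dumps(build_search_body(lucene_query, size)).encode("utf-8")
    request = urllib.request.Request(
        f"{es_url}/{index}/_search",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["hits"]["hits"]
```

For example, `search("http://localhost:9200", "wins", "exchange:nexage")` would return up to ten matching win records, assuming an index and field with those hypothetical names exist.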

Custom Kibana Reports

Kibana contains a report builder that lets you create various display widgets, such as charts, graphs, maps, heat maps, and tag clouds. A sample time series graph showing request rate over time is shown below.

You can combine various visualizations into a dashboard. This dashboard shows bidder transaction activity as well as bidder performance metrics in a single view. Any data in the logs can be visualized with these tools.

Campaign Manager Reports

Data logged into Elasticsearch can be made available to users of the Campaign Manager. The following Campaign Manager view shows each campaign's usage using data extracted from Elasticsearch.

Custom Data Stores

If you have special processing needs, you can build your own app to consume and process the data. An easy method is to build your own Kafka consumer and subscribe to the data topics you wish to process.
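
A minimal sketch of such a consumer is shown below. It assumes the third-party `kafka-python` package as the client library and assumes that each message value is a JSON object; neither is confirmed by this page, so verify the encoding against your own topics. The group id is a hypothetical name, as are the topic and broker address in the usage note.

```python
import json


def decode_record(raw):
    """Decode one message value into a dict, or None if it is not a JSON object."""
    try:
        record = json.loads(raw)
    except (ValueError, TypeError):
        return None
    return record if isinstance(record, dict) else None


def consume(topic, brokers, handler, group_id="my-dmp-consumer"):
    """Read messages from `topic` and pass each decoded record to `handler`."""
    # Imported here so decode_record works without kafka-python installed.
    from kafka import KafkaConsumer

    consumer = KafkaConsumer(
        topic,
        bootstrap_servers=brokers,
        group_id=group_id,
        auto_offset_reset="earliest",
    )
    for message in consumer:
        record = decode_record(message.value)
        if record is not None:
            handler(record)
```

Calling `consume("requests", "localhost:9092", print)` would print each record as it arrives, assuming a topic with that hypothetical name exists. Running several copies with the same `group_id` lets Kafka divide the topic's partitions among them.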

The open source secor project lets you read any of the RTB logs and store them in cloud storage, such as Amazon S3 or Google Cloud Storage. This option offers effectively unlimited storage, so you do not need to manage local disk space for long-term retention. You can then download the data for offline analysis using tools like Hadoop.

If you need to process analytics in real time, a custom Kafka consumer can read the log stream, analyze records as they arrive, and store the results in your own database.
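
One simple form of real-time analysis is a rolling count of events per key, for example wins per campaign over the last minute. The sketch below is one possible approach, not part of RTB4FREE. The class name is hypothetical, the keys and timestamps would come from whatever fields your records carry, and it assumes timestamps arrive in non-decreasing order. Writing the snapshots to a database is left out.

```python
from collections import Counter, deque


class RollingCounter:
    """Count events per key over the most recent `window_seconds` seconds."""

    def __init__(self, window_seconds=60):
        self.window = window_seconds
        self.events = deque()    # (timestamp, key) pairs in arrival order
        self.counts = Counter()  # current count per key inside the window

    def add(self, key, timestamp):
        """Record one event, then drop events that fell out of the window."""
        self.events.append((timestamp, key))
        self.counts[key] += 1
        self._expire(timestamp)

    def _expire(self, now):
        # Events at or before (now - window) are no longer counted.
        while self.events and self.events[0][0] <= now - self.window:
            _, old_key = self.events.popleft()
            self.counts[old_key] -= 1
            if self.counts[old_key] == 0:
                del self.counts[old_key]

    def snapshot(self):
        """Return a plain dict of the counts currently in the window."""
        return dict(self.counts)
```

A consumer would call `add()` once per record and periodically write `snapshot()` to its own data store.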

Source Code

The RTB4FREE source code for all the services is located here.