Anura Anti-Fraud, Mobfox Exchange, Global CIDR Blacklist, Geopatching, New Logging - Kinesis

Last updated: April 12, 2019

Overview

This release of RTB4FREE has several new capabilities:

  1. Support for Anura anti-fraud
  2. A new exchange: Mobfox
  3. A global CIDR blacklist
  4. Geo-patching of missing device.geo data
  5. A better-defined logging setup, including Amazon Kinesis logging

While the new features affect configuration, the defaults built into the system should let current deployments keep working without any changes.

Specifically, the logging environment variables have changed in the Docker deployments. The new configuration is more flexible; however, your current setups using Kafka will continue to work, so long as you specify BROKERLIST in your startup of the bidder.

Anura Anti-fraud

Anura anti-fraud is implemented in the bidder. It is configured in Campaigns/payday.json as follows:

  "fraud" : {
    "type" : "$FRAUDTYPE",
    "threshhold" : "$FRAUDTHRESHOLD",
    "ck" : "$FRAUDKEY",
    "endpoint" : "$FRAUDENDPOINT",
    "bidOnError" : "$FRAUDBIDONERROR",
    "watchlist" : "$FRAUDWATCHLIST"
  },

These values are set in the environment section of the Docker compose file:

 bidder:
    image: ...
    environment:
      FRAUDTYPE: "Anura"
      FRAUDTHRESHOLD: "10"
      FRAUDKEY: "yourfraudkeyhere"
      FRAUDENDPOINT: ""
      FRAUDBIDONERROR: "false"
      ...

Here is an explanation of all the fields:

  • FRAUDTYPE: Set to Anura. Note, if this is omitted or blank, no fraud checking is done (a minimal sketch of that follows this list).
  • FRAUDTHRESHOLD: The percentage of IP addresses to check, after all other constraints have been checked.
  • FRAUDKEY: Your Anura account key (set as the "ck" value in payday.json).
  • FRAUDENDPOINT: An alternative endpoint. If blank or missing, the default Anura endpoint is used.
  • FRAUDBIDONERROR: Whether to bid anyway if the anti-fraud system fails; here it is set to "false", meaning no bid.
  • FRAUDWATCHLIST: Note, this is not used for Anura.
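For example, a minimal sketch showing fraud checking disabled; everything else in the bidder's environment stays as-is:

 bidder:
    image: ...
    environment:
      # A blank (or omitted) FRAUDTYPE disables all fraud checking
      FRAUDTYPE: ""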

When properly configured, the startup log looks like this:

2019-03-20 14:02:31 INFO  Configuration:539 - *** Fraud detection is set to Anura

The performance penalty for a non-cached entry is less than 1 ms (unloaded). Make sure your endpoint is close, or performance will be severely degraded.

Mobfox Exchange

A new exchange, Mobfox, has been added. It is automatically configured; you don't need to make any changes to your system to use it. However, be aware that the Mobfox endpoint on the bidder is /rtb/bids/mobfox.

CIDR Blacklist

A master blacklist of CIDR-formatted IP addresses has been implemented in the bidder. It is enabled by setting the $MASTERCIDR environment variable in the Docker compose file to the blacklist's filename, and mounting that file into the container. In the example below, the file lives at /data/blacklist.txt inside the container. The format of the file looks like this:

# IP ranges identified as being used by the "@MASTERCIDR" object in RTB4FREE
45.33.224.0/20
45.43.128.0/21
45.43.136.0/22
...

Note, comments begin with # in the first column.

Example Docker compose entries:

    environment:
      MASTERCIDR: "/data/blacklist.txt"
    volumes:
      - "/home/ubuntu/data:/data"

The file in the docker container is /data/blacklist.txt. It is mounted from the file /home/ubuntu/data/blacklist.txt on the host. If you edit or replace this file, you need to restart the bidder to reload it. When initialized properly, you will see two log entries similar to these:

2019-03-20 14:10:00 INFO  Configuration:1367 - *** Configuration Initialized @MASTERCIDR with /data/blacklist.txt

And:

2019-03-20 14:10:05 INFO  Configuration:483 - *** Master Blacklist is set to: com.jacamars.dsp.rtb.blocks.NavMap@52066604

The performance penalty for doing a CIDR lookup is 250 microseconds.

GEOPATCH

Often, especially on desktop-based bid requests using site objects, the device.geo field is not filled in. This removes a lot of potential traffic if you do geo targeting. However, you can have the bidder patch these fields for you if you set up the "GEOPATCH" feature.

This feature is implemented in the bidder. The MaxMind City database provides support for when the bid request lacks device.geo, or when device.geo is missing any of these fields: city, region, country, zip code. We call this feature "geopatching". A geopatching MMDB is enabled by setting the $GEOPATCH environment variable in the Docker compose file to the database's filename, and mounting that file into the container. The file is a MaxMind GeoIP database; in the example below it lives at /data/GeoLite2-City.mmdb inside the container.

You can download a copy of the MaxMind MMDB file from the MaxMind website; make sure you download the "City" database. An example entry in the Docker compose file looks like this:

    environment:
      GEOPATCH: "/data/GeoLite2-City.mmdb"
    volumes:
      - "/home/ubuntu/docker/data:/data"

Inside the container, the configuration contains an entry that looks like this:

    GEOPATCH: "$GEOPATCH",

By default, this value is "", meaning no geopatching. The file in the docker container is /data/GeoLite2-City.mmdb. It is mounted from the file /home/ubuntu/docker/data/GeoLite2-City.mmdb on the host. If you replace the file GeoLite2-City.mmdb, you need to restart the bidder to reload it. Note, you must use the "City" database; other databases will fail. When properly configured, the startup log will show entries similar to these:

2019-03-20 14:10:00 INFO  Configuration:1367 - *** Configuration Initialized @ISO2-3 with data/adxgeo.csv

And:

2019-03-20 13:52:34 INFO  Configuration:587 - *** GEOPATCH DB set to: /data/GeoLite2-City.mmdb ***

Note: geopatching depends on the loading of the file "adxgeo.csv" at startup to support the translation of ISO2 country codes to ISO3. MaxMind uses ISO2; RTB requires ISO3. The "adxgeo.csv" file is loaded by default and is stored in every docker container.
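For example, MaxMind reports the United States as "US" (ISO2), while the OpenRTB bid request expects "USA" (ISO3).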

The performance penalty for patching geo data into the RTB bid request is about 2 ms.

Logging Changes

In previous versions of RTB4FREE, the logging channels were wired to Kafka in Campaigns/payday.json, with the broker list referenced directly from the docker-compose environment. An example is shown below:

  "zeromq" : {
      "bidchannel" : "kafka://[$BROKERLIST]&topic=bids",
      "winchannel" : "kafka://[$BROKERLIST]&topic=wins",
      "requests" : "kafka://[$BROKERLIST]&topic=requests",
      "clicks" : "kafka://[$BROKERLIST]&topic=clicks",
      "pixels" : "kafka://[$BROKERLIST]&topic=pixels",
      "videoevents": "kafka://[$BROKERLIST]&topic=videoevents",
      "postbackevents": "kafka://[$BROKERLIST]&topic=postbackevents",
      "status" : "kafka://[$BROKERLIST]&topic=status",
      "reasons" : "kafka://[$BROKERLIST]&topic=reasons",

In this setup, you were tied to Kafka. If you wanted to use any other logging transport RTB4FREE supports (like files, or pipes, etc.), you had to modify payday.json on your host machine and then map that file to Campaigns/payday.json in the container. Now we have a different setup, and the payday.json file uses environment variables like so:

  "zeromq" : {
      "bidchannel" : "$BIDSCHANNEL",
      "winchannel" : "$WINSCHANNEL",
      "requests" : "$REQUESTSCHANNEL",
      "clicks" : "$CLICKSCHANNEL",
      "pixels" : "$PIXELSCHANNEL",
      "videoevents": "$VIDEOEVENTSCHANNEL",
      "postbackevents": "$POSTBACKEVENTSCHANNEL",
      "status" : "$STATUSCHANNEL",
      "reasons" : "$REASONSCHANNEL",

Now you can specify the logging channels directly from the environment in the docker-compose file.

But note, the default for each of these maps to the previous version's Kafka specification. So, for example, $BIDSCHANNEL defaults to:

kafka://[$BROKERLIST]&topic=bids

All you need to do to keep using Kafka is specify BROKERLIST in the docker-compose file, just like you always did.

Docker Environment Variables

In the new logging scheme, if you want to use something other than Kafka, you need to specify the appropriate value in the docker-compose file. For example, if you wanted to log bids to a file, the environment in the docker-compose file would look like this:

 bidder:
    image: "jacamars/rtb4free:v1"
    environment:
      GDPR_MODE: "false"
      
      BROKERLIST: ""
      BIDSCHANNEL: "file://bids.json"

This writes the file bids.json inside the container. If you want to keep the file on the local host, you need to map it out of the container using the volumes Docker directive, as in the sketch below.
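A sketch of that mapping; this assumes the bidder writes bids.json relative to the container's root directory, so adjust the container-side path to wherever your image actually writes it:

 bidder:
    image: "jacamars/rtb4free:v1"
    environment:
      BROKERLIST: ""
      BIDSCHANNEL: "file://bids.json"
    volumes:
      # Assumption: bids.json lands in / inside the container
      - "./bids.json:/bids.json"

Note that when bind-mounting a single file, the file must already exist on the host (e.g., create an empty bids.json first), or Docker will create a directory in its place.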

This is the list of environment variables and their defaults:

  • $BIDSCHANNEL - kafka://[$BROKERLIST]&topic=bids
  • $WINSCHANNEL - kafka://[$BROKERLIST]&topic=wins
  • $REQUESTSCHANNEL - kafka://[$BROKERLIST]&topic=requests
  • $CLICKSCHANNEL - kafka://[$BROKERLIST]&topic=clicks
  • $PIXELSCHANNEL - kafka://[$BROKERLIST]&topic=pixels
  • $VIDEOEVENTSCHANNEL - kafka://[$BROKERLIST]&topic=videoevents
  • $POSTBACKEVENTSCHANNEL - kafka://[$BROKERLIST]&topic=postbackevents
  • $STATUSCHANNEL - kafka://[$BROKERLIST]&topic=status
  • $REASONSCHANNEL - kafka://[$BROKERLIST]&topic=reasons
  • $LOGCHANNEL - kafka://[$BROKERLIST]&topic=logs

Default Kafka Logging

To keep using Kafka, just specify BROKERLIST in the docker-compose file as you always have; all of the new environment variables default to the previous Kafka specifications.
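For example, a minimal sketch; the broker list value is a placeholder, so substitute your own comma-separated host:port pairs:

 bidder:
    image: "jacamars/rtb4free:v1"
    environment:
      # All of the *CHANNEL variables keep their Kafka defaults
      BROKERLIST: "kafka1:9092,kafka2:9092"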

Kinesis Logging

The latest logging capability we offer is the ability to log using Amazon Kinesis. In order to use Kinesis logging you need the following beforehand:

  1. Your AWS access key. For the sake of example, say this is "AZIQSQ1223921".
  2. Your AWS secret key. For the sake of example, say this is "TYSSYYQ12891891-WE19".
  3. The region where your Kinesis logging will be done. For the sake of example, say this is "US-East-1".
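Note that the credentials you supply must be allowed to write to the Kinesis streams you name and, if you want the logger to create missing streams for you (see below), to create them as well.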

With these in hand, you are ready to set up the Docker environment variables for Amazon Kinesis access:

 bidder:
    image: "jacamars/rtb4free:v1"
    environment:
      GDPR_MODE: "false"
      
      AWSACCESSKEY: "AZIQSQ1223921"
      AWSSECRETKEY: "TYSSYYQ12891891-WE19"
      AWSREGION: "US-East-1"
      BROKERLIST: ""
      BIDSCHANNEL: "file://bids.json"

That's the setup for the Amazon portion. We could delete BROKERLIST, since it's not going to be used. Now we add the Kinesis definitions for all the logger channels:

 bidder:
    image: "jacamars/rtb4free:v1"
    environment:
      GDPR_MODE: "false"

      AWSACCESSKEY: "AZIQSQ1223921"
      AWSSECRETKEY: "TYSSYYQ12891891-WE19"
      AWSREGION: "US-East-1"

      BIDSCHANNEL: "kinesis://aws_access_key=$AWSACCESSKEY&aws_secret_key=$AWSSECRETKEY&records=10&sleep=250&topic=bids"
      WINSCHANNEL: "kinesis://aws_access_key=$AWSACCESSKEY&aws_secret_key=$AWSSECRETKEY&records=10&sleep=250&topic=wins"
      REQUESTSCHANNEL: "kinesis://aws_access_key=$AWSACCESSKEY&aws_secret_key=$AWSSECRETKEY&records=10&sleep=250&topic=requests"
      PIXELSCHANNEL: "kinesis://aws_access_key=$AWSACCESSKEY&aws_secret_key=$AWSSECRETKEY&records=10&sleep=250&topic=pixels"
      VIDEOEVENTSCHANNEL: "kinesis://aws_access_key=$AWSACCESSKEY&aws_secret_key=$AWSSECRETKEY&records=10&sleep=250&topic=events"
      POSTBACKEVENTSCHANNEL: "kinesis://aws_access_key=$AWSACCESSKEY&aws_secret_key=$AWSSECRETKEY&records=10&sleep=250&topic=postback"
      STATUSCHANNEL: "kinesis://aws_access_key=$AWSACCESSKEY&aws_secret_key=$AWSSECRETKEY&records=10&sleep=250&topic=status"
      LOGCHANNEL: "kinesis://aws_access_key=$AWSACCESSKEY&aws_secret_key=$AWSSECRETKEY&records=10&sleep=250&topic=logs"

Taking a look at the specification, you can see its components; here are their meanings:

  • aws_access_key - The account access key. We use the $AWSACCESSKEY value so we can pass it in as an environment variable; this reduces typing and cuts down on mistakes.
  • aws_secret_key - The account secret key. We use the $AWSSECRETKEY value so we can also pass this value in as an environment variable.
  • records - The number of records to buffer before writing. Default is 10 records.
  • sleep - The number of milliseconds to wait between writes. Default is 250 ms.
  • topic - The Kinesis stream name. Note, if the stream does not exist, the logger will attempt to create it for you.
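As a rough worked example: with the defaults (records=10, sleep=250), each channel writes at most ten records every 250 ms, i.e., on the order of 40 records per second. For higher-volume channels such as requests, raise records or lower sleep accordingly.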