
Batch Log Level Data

Batch Log-Level Data (LLD) allows you to retrieve and track feeds of log-level event data, delivered in batches, that include dimensions not available in the AppNexus Console or via the API Report Service. Although this data has higher latency than Streaming Log-Level Data, it is more accurate. Feeds are generated hourly and are split into one or more files (see the File Formats & Schemas section below). The format of the files you receive depends on what you specified when you subscribed (e.g., Avro, Protobuf, Protobuf-delimited).

For general information about Log-Level Data and a comparison with the Streaming LLD product, see Log-level Data feeds.

File Formats & Schemas

You may specify one or more of the following formats when subscribing to the service.

All formats (Protobuf, Protobuf-delimited, and Avro) support automatic push via the LLD Cloud Export options offered by AppNexus.

Use the downloads below to get packaged example files and sample code for consuming Log-Level Data files.

The example files are provided to help you test the implementation you will use to consume Log-Level Data files. To ease testing, they are somewhat simpler than the files you will retrieve in production:

  • Example files for the protobuf format are not compressed (in production, they are Snappy compressed)
  • Example data does not contain values that are typical for a given column. Instead, columns are populated with the column's index number converted to the column's type.
Each version listed below provides three downloads: a Schemas Zip, an Example Files Zip, and an Example Code Zip (the example code includes both the schemas and the example files).

Version 0.5.2, released November 7, 2019

Added a new external_campaign_id field to the Standard Feed in LLD. This new optional field should only appear to sellers on resold impression rows. The value of this field is passed in via the cid field on a DSP's bid. Since the cid field is optional, the new external_campaign_id field will only have data when external DSPs populate it on their bids. See the OpenRTB specification for more info on the cid field.

Version 0.5.1, released September 10, 2019
  • Added partner_fees to the Standard Feed.
  • Added a partition_time_millis field to all feeds to simplify the loading and partitioning of data into databases.
  • Added a hashed_user_id_64 field to the Conversion Pixel and Segment feeds for clients who only want anonymized personal data.

Version 0.4.4, released May 29, 2019
  • Added tc_string to the Standard Feed.

Version 0.3.4, released April 10, 2019
  • Added split_id to the Standard Feed.

Version 0.3.3, released April 5, 2019
  • Added hashed_user_id_64 to the Segment Feed.
  • Added hashed_user_id_64, latitude_trunc, and longitude_trunc to the Standard Untransacted Feed.

Version 0.3.0, released October 4, 2018
  • Added Avro schemas and example files.
  • Added a partition_time_millis column to all feeds for partition filtering per record.

Version 0.2.4, released August 22, 2018
  • Removed hashed_device_unique_id from the Standard Feed schema (no longer used).
  • Added a schema for the Standard Untransacted Feed.

Version 0.1.9, released April 12, 2018
  • Allow specifying the protobuf version, e.g. -Dprotobuf.version="2.5.0".
  • Patched the schema class finder to work with proto3-generated code.

Version 0.1.8, released March 16, 2018
  • Added a proto2 syntax hint to make the proto files compilable with proto3.

Version 0.1.7, released March 12, 2018
  • Protobuf-delimited files are now GZIP compressed; updated the sample code accordingly.

Version 0.1.6, released March 7, 2018
  • Added anonymized personal data fields to various relevant LLD schemas.

Version 0.1.4, released December 8, 2017
  • Added the imps_for_budget_caps_pacing column to the Standard Feed.

Version 0.1.3, released October 16, 2017
  • Initial release.

Protobuf (Sequence file wrapped Protocol Buffers)

Only version 2.5.0 of protobuf is currently supported.

Files are Snappy-compressed Hadoop Sequence files in which the value for each record is a BytesWritable whose payload is an encoded Protocol Buffer message.

All schemas specify that fields are optional, and null values are represented as unset fields in the protobuf message. See the individual feeds under Log-level data feeds for the conditions that cause a field's value to be null and for more details on column availability.

See Protobuf Install and Configuration for instructions on how to install and configure the protobuf compiler and to download a project that includes the schemas and sample code.
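
To make the Sequence file layout concrete, below is a minimal Java sketch (separate from the packaged sample code) that iterates over a single feed file and decodes each value. It assumes the Hadoop client libraries and Snappy codec support are available on the classpath, and it uses StandardFeedProtos.StandardFeed as a placeholder for whatever class protoc 2.5.0 generates from the schema you downloaded.

import java.util.Arrays;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Writable;
import org.apache.hadoop.util.ReflectionUtils;

public class SequenceFileFeedReader {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Path path = new Path(args[0]); // path to one hourly feed file

        try (SequenceFile.Reader reader =
                 new SequenceFile.Reader(conf, SequenceFile.Reader.file(path))) {
            // The key is not needed for decoding; instantiate whatever type the file declares.
            Writable key = (Writable) ReflectionUtils.newInstance(reader.getKeyClass(), conf);
            BytesWritable value = new BytesWritable();

            while (reader.next(key, value)) {
                // BytesWritable's backing array may be padded, so copy only the valid bytes.
                byte[] payload = Arrays.copyOf(value.getBytes(), value.getLength());

                // StandardFeedProtos.StandardFeed is a placeholder for the generated schema class.
                StandardFeedProtos.StandardFeed record =
                    StandardFeedProtos.StandardFeed.parseFrom(payload);
                System.out.println(record);
            }
        }
    }
}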

Protobuf-delimited (Protocol Buffers)

Only version 2.5.0 of protobuf is currently supported.

Files are GZIP compressed and contain length-delimited Protocol Buffer messages: each record is a varint specifying the length of the message, followed by the protobuf message itself. One reason to use the protobuf-delimited format instead of the protobuf format is that reading protobuf-delimited files requires neither Hadoop nor native Snappy support.

All schemas specify that fields are optional, and null values are represented as unset fields in the protobuf message. See the individual feed service pages for the conditions that cause a field's value to be null and for more details on column availability.

See Protobuf Install and Configuration for instructions on how to install and configure the protobuf compiler, and to download a project that includes the schemas and sample code.
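
Under the same assumptions as the sketch above, reading the protobuf-delimited format needs only the protobuf runtime and the JDK: wrap the file in a GZIPInputStream and call parseDelimitedFrom, which consumes the varint length prefix and the message body and returns null at end of stream. StandardFeedProtos.StandardFeed is again a placeholder for the generated schema class.

import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.InputStream;
import java.util.zip.GZIPInputStream;

public class DelimitedFeedReader {
    public static void main(String[] args) throws Exception {
        // Decompress on the fly; the feed file stays GZIP compressed on disk.
        try (InputStream in = new GZIPInputStream(
                 new BufferedInputStream(new FileInputStream(args[0])))) {
            while (true) {
                // Reads one varint-prefixed message; returns null once the stream is exhausted.
                StandardFeedProtos.StandardFeed record =
                    StandardFeedProtos.StandardFeed.parseDelimitedFrom(in);
                if (record == null) {
                    break;
                }
                System.out.println(record);
            }
        }
    }
}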

Avro

Avro is a data serialization framework that bundles schemas with data. For compression, the DEFLATE codec (level = 1) is used. For more details, see https://avro.apache.org/docs/current/.

Unlike in our protobuf formats, null values are never used. Missing or unset fields are encoded with their default values, as specified in the feed schema.

Avro is offered for simpler integration with existing third-party Cloud systems. Due to incompatibilities found while testing integrations, fields that are enums in protobuf are sent as Avro ints.
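Because the writer schema travels with the data, a generic reader is enough to inspect an Avro feed file. The minimal Java sketch below assumes only the Apache Avro library is on the classpath; field names and types come from the schema embedded in the file, and the DEFLATE compression is handled transparently.

import java.io.File;

import org.apache.avro.file.DataFileReader;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericRecord;

public class AvroFeedReader {
    public static void main(String[] args) throws Exception {
        File feedFile = new File(args[0]); // path to one hourly Avro feed file

        // No external schema is needed: DataFileReader uses the schema embedded in the file.
        try (DataFileReader<GenericRecord> reader =
                 new DataFileReader<>(feedFile, new GenericDatumReader<GenericRecord>())) {
            GenericRecord record = null;
            while (reader.hasNext()) {
                record = reader.next(record); // reuse the record object between iterations
                System.out.println(record);
            }
        }
    }
}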

Batch Data Retrieval Options

AppNexus provides two ways to receive batch LLD files.

Cloud Export

Cloud Export is a push-based mechanism in which batch-generated files are uploaded directly to your cloud storage location (e.g., S3 bucket, Azure container, Google bucket). See Log Level Data - Cloud Export for more details.
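
As a rough sketch of the consumer side of a push to S3, the snippet below lists the objects under a prefix using the AWS SDK for Java v2 so they can be handed to one of the readers above. The bucket name and prefix layout here are hypothetical; substitute whatever location you configured for your Cloud Export subscription.

import software.amazon.awssdk.services.s3.S3Client;
import software.amazon.awssdk.services.s3.model.ListObjectsV2Request;
import software.amazon.awssdk.services.s3.model.S3Object;

public class CloudExportLister {
    public static void main(String[] args) {
        // Hypothetical bucket and prefix; use the location configured for your subscription.
        String bucket = "example-lld-bucket";
        String prefix = "standard_feed/";

        try (S3Client s3 = S3Client.create()) {
            ListObjectsV2Request request = ListObjectsV2Request.builder()
                    .bucket(bucket)
                    .prefix(prefix)
                    .build();

            // A single call returns up to 1000 objects; a production consumer would paginate.
            for (S3Object object : s3.listObjectsV2(request).contents()) {
                System.out.println(object.key() + " (" + object.size() + " bytes)");
            }
        }
    }
}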
