Kafka Catalog

This catalog provides an overview of the Kafka topics used in the Data and Content Delivery Kafka Cluster, including their associated Avro schemas. The schemas define the message format for each topic and are registered in a Schema Registry. All topics use FORWARD TRANSITIVE compatibility checking: a schema may evolve within a topic as long as data written with the new schema remains readable by consumers using any of the previous schema versions.
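
The forward transitive rule can be illustrated with a minimal sketch. It assumes a heavily simplified schema model (field name mapped to whether the field declares a default) and hypothetical schema versions; it is not how the Schema Registry implements the check, and real Avro schema resolution also covers type promotion, aliases, and unions.

```python
from typing import Dict, List

# Simplified, hypothetical model: a schema version maps each field name to
# True if the field declares a default value, False otherwise.
Schema = Dict[str, bool]


def is_forward_compatible(new: Schema, old: Schema) -> bool:
    """Data written with `new` is readable with `old` if every field the old
    schema expects is still written, or has a default in the old schema."""
    return all(name in new or has_default for name, has_default in old.items())


def is_forward_transitive(new: Schema, history: List[Schema]) -> bool:
    """Forward transitive: check the new schema against all earlier versions."""
    return all(is_forward_compatible(new, old) for old in history)


# Hypothetical version history of one topic.
v1 = {"id": False, "title": False}
v2 = {"id": False, "title": False, "subtitle": True}

# Adding a field is allowed: old readers simply ignore it.
adds_field = {"id": False, "title": False, "subtitle": True, "language": False}
# Dropping a field that old readers require (no default) is rejected.
drops_required = {"id": False, "subtitle": True}

print(is_forward_transitive(adds_field, [v1, v2]))      # True
print(is_forward_transitive(drops_required, [v1, v2]))  # False
```

In practice this means producers can add fields freely, while a field may only be removed if every earlier schema version declares a default for it.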

Catalog Overview

macrodata
marketpage
statisticapi

Topics

macrodata

Time-series indicator data by geography.

Schema: MacroData

Fields:

  • id (uuid(string))
  • title (string)
  • description (string)
  • geo (Geo)
    • name (string)
    • code (string)
    • level (int)
  • coveredTimeframe (CoveredTimeframe)
    • displayText (null | string) — An optional human readable text to display the covered timeframe
    • start (string) — The start date of the time span covering the data
    • end (string) — The end date of the time span covering the data
  • unit (Unit) — Detailed explanation of what the indicator measures.
    • id (uuid(string))
    • name (string)
  • sector (Sector)
    • id (uuid(string))
    • name (string)
  • populationGroup (PopulationGroup)
    • id (uuid(string))
    • name (string)
  • adjustment (Adjustment)
    • id (uuid(string))
    • name (string)
  • data (array)
    • startDate (date(int))
    • endDate (date(int))
    • value (double)
    • updatedAt (timestamp-micros(long))
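
The `date(int)` and `timestamp-micros(long)` entries are Avro logical types: a date is stored as the number of days since 1970-01-01, and a timestamp-micros as the number of microseconds since 1970-01-01T00:00:00 UTC. The following sketch decodes both by hand, assuming the consumer receives the raw integers; some Avro deserializers already perform this conversion, in which case it is unnecessary. The sample data point is made up for illustration.

```python
from datetime import date, datetime, timedelta, timezone

EPOCH_DATE = date(1970, 1, 1)
EPOCH_UTC = datetime(1970, 1, 1, tzinfo=timezone.utc)


def decode_date(days: int) -> date:
    """Avro `date`: days since the Unix epoch."""
    return EPOCH_DATE + timedelta(days=days)


def decode_timestamp_micros(micros: int) -> datetime:
    """Avro `timestamp-micros`: microseconds since the Unix epoch, in UTC."""
    return EPOCH_UTC + timedelta(microseconds=micros)


# Hypothetical entry of the `data` array, with raw (undecoded) values.
point = {
    "startDate": 19723,
    "endDate": 20088,
    "value": 2.5,
    "updatedAt": 1_700_000_000_000_000,
}

print(decode_date(point["startDate"]))              # 2024-01-01
print(decode_date(point["endDate"]))                # 2024-12-31
print(decode_timestamp_micros(point["updatedAt"]))  # 2023-11-14 22:13:20+00:00
```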

marketpage

Schema: MarketPage

Fields:

  • id (uuid(string))
  • language (string)
  • title (string)
  • url (string)
  • geo (Geo)
    • name (string)
    • code (string)
    • level (int)
  • industry (Industry)
    • id (uuid(string))
    • legacyId (long)
    • parentId (null | uuid(string))
    • name (string)
    • definition (string)
    • inScope (null | string)
    • outOfScope (null | string)
    • hierarchy (array)
      • id (uuid(string))
      • name (string)
  • coveredTimeframe (CoveredTimeframe)
    • displayText (null | string) — An optional human readable text to display the covered timeframe
    • start (string) — The start date of the time span covering the data
    • end (string) — The end date of the time span covering the data
  • indicators (array)
    • id (uuid(string))
    • title (string)
    • description (string)
    • source (string)
    • url (string)
    • coveredTimeframe (CoveredTimeframe)
    • unit (string)
    • data (array)
      • startDate (date(int))
      • endDate (date(int))
      • value (double)
      • updatedAt (timestamp-micros(long))
  • textboxes (map)
    • title (string)
    • url (string)
    • text (string)
    • updatedAt (timestamp-micros(long))

statisticapi

Statistic API DTO from Numera.

Schema: StatisticApi

Fields:

  • numeraId (string) — The id of the statistic provided by Numera as a uuid
  • statista35Id (string) — The id of the statistic provided by the legacy Content Tools as a sequence
  • title (string) — The title is also known as the headline
  • subtitle (string) — The subtitle is also known as the catchline (h1)
  • description (string) — A vivid description of the statistic. It can contain html tags. For premium statistics the facts are wrapped in a span: fact
  • urlPath (string) — The url path is language / platform specific and contains the complete url path except for the host, for example: "/statistics//headline-slug/"
  • released (timestamp-micros(long)) — The timestamp of the publication
  • updated (timestamp-micros(long)) — The updated timestamp shows when any field of the statistic was last updated
  • contentUpdated (timestamp-micros(long)) — The content updated timestamp indicates when the content of the statistic itself, not only its metadata, was last updated and the statistic was republished
  • premium (boolean) — Indicates if the statistic is in the paid plan or freely accessible
  • language (string) — The language the texts are written in, also indicates on which platform the statistic can be found
  • coveredTimeframe (CoveredTimeframe) — The covered timeframe contains the time span the data covers
    • displayText (null | string) — An optional human readable text to display the covered timeframe
    • start (string) — The start date of the time span covering the data
    • end (string) — The end date of the time span covering the data
  • coveredGeos (CoveredGeos) — The covered geo locations indicate which regions the data is relevant to
    • level1 (array) — Regional level, e.g. North America, OECD. There is no ISO Standard for the code. For a shared taxonomy please use the codes in the geos_level_1.json file.
      • code (string) — A common statista maintained geo code to reference a specific geo location
      • name (string) — An English name for the geo location
      • displayName (null | string) — A language specific name for the geo location
    • level2 (array) — Country level, e.g. Germany, Australia. Includes disputed territories such as Taiwan.
    • level3 (array) — Subnational, e.g. Bayern, California
  • mainTags (null | array) — A list of more relevant tags
  • tags (null | array) — A list of less relevant tags
  • author (null | Author) — The author of the statistic. May be unset if the author should not be published
    • fullName (string) — The full name of the author in the format ' '
    • statista35Id (string) — The id of the author provided by the legacy Content Tools as a sequence
  • complianceType (string) — The compliance type indicates how the data has been gathered. One of: 'OnlineStorePurchase', 'IndividualPurchase', 'OfflineStorePurchase', 'GovernmentUniversityPurchase', 'Registration', 'Cooperation', 'FreelyAccessible', 'PublicationRevoked', 'DataCoopFreelyAvailable', 'DataCoopExclusive', 'DataCoopRegistration', 'License', 'StatistaSurvey'
  • industries (array)
    • statista35Id (string) — The id of the industry provided by the legacy Content Tools as a sequence
    • name (string) — The human readable name of the industry, provided in the language of the statistic
    • subindustry (null | StatisticIndustry) — The subindustry, if any. Industries are organized as a tree; at most 3 layers are expected.
  • sourceInfo (SourceInfo)
    • sources (null | array) — One or more sources the data comes from.
      • statista35Id (string) — The id of the source provided by the legacy Content Tools as a sequence
      • title (string) — The name of the source
      • subtitle (null | string) — An optional subtitle of the source
      • legalNotice (null | string) — A legal notice for the source if any
      • active (boolean) — Indicates whether the source is still active. For example, a website that is no longer available is not active.
    • conductors (null | array) — One or more conductors of the survey. Also known as charger.
    • publishers (null | array) — One or more publishers of the survey.
    • sourceLink (null | string) — A direct link to the source, where the data / survey is coming from
    • sourceLinkTitle (null | string) — A human readable title for the source link. For example 'Washington Post-ABC News poll'
  • data (DataGrid) — The actual data points organized row based
    • columns (array) — An array of column records
      • columnType (string) — The type of the column. One of: 'text' and 'number'
      • label (null | string) — A free text label of the column
    • rows (array) — An array of rows
      • values (array) — The values of a row. The first item of the row mostly contains the header for the row.
  • display (BarChart | LineChart | LineZoomableChart | LineWithAnnotationsChart | PieChart | DualYAxesChart | StackedAreasChart | StackedBarsChart | TableChart) — The display field provides a suggestion made by the author on how to render the given data
    • chartType (string) — The chartType for BarChart is always 'bar' and can be used as a discriminator
    • columnColor (null | array) — An array containing hex rgb color values, if there are custom colors to colorize the bars
    • xAxisDescription (null | string) — An optional x axis description, which describes the attributes
    • yAxisDescription (null | string) — An optional y axis description, which describes the values
    • unit (null | string) — An optional unit for all the number values in the data grid rows
    • orientation (enum(BarChartOrientation)) — The orientation of the bar chart
    • hideValues (boolean) — Indicates whether the actual values should be displayed or hidden on the data points
    • hideYAxis (boolean) — Indicates whether the values of the y axis should be hidden
    • hideMaxValue (boolean) — Indicates whether the maximum value on the y axis should be hidden
  • statisticType (string) — Describes the type of the statistic. One of 'STATISTIC', 'SURVEY', 'SMI_STATISTIC', 'CPI', 'FORECAST', 'BLOCKED_STATISTIC'
  • pageTitle (string) — The title for the web page
  • surveyType (null | string) — Only available if the statisticType is 'SURVEY', and optional even then. Indicates the type of the survey as a language-specific name, like 'Face-to-face interview'
  • specialProperties (null | string) — A free text to add some information about the data or data source. Also known as special characteristics.
  • surveyName (null | string) — A free text to give the survey a title
  • ageGroup (null | string) — A free text stating which age group has been interviewed
  • numberInterviewed (null | string) — A free text, containing how many people have been interviewed, may contain more than just a number
  • seoKeywords (null | string) — Search Engine Optimization keywords
  • seoMetaDescription (null | string) — Search Engine Optimization meta description
  • supplementaryNotesHtml (null | string) — Supplementary notes or support text. Mostly it contains the specific question asked in a survey. Can contain html tags.
  • extendedReadingSupport (boolean) — Indicates whether the description is long and an extended reading support button should be shown
  • recommendedReports (null | array)
    • statista35Id (string) — The id of the statistic provided by the legacy Content Tools as a sequence
    • title (string) — The title of the report in the language of the report
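
As a rough illustration of the row-based `data` (DataGrid) field, the sketch below turns a grid into one dictionary per row. It assumes the message has already been deserialized into plain Python dictionaries and that the entries in `values` line up positionally with `columns`; the catalog does not state this explicitly. The helper name `grid_to_records`, the `column_<index>` fallback for missing labels, and the sample values are all hypothetical.

```python
from typing import Any, Dict, List


def grid_to_records(grid: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Map each row's values onto the column labels, by position."""
    # `label` is nullable, so fall back to a positional placeholder name.
    labels = [
        col.get("label") or f"column_{i}"
        for i, col in enumerate(grid["columns"])
    ]
    return [dict(zip(labels, row["values"])) for row in grid["rows"]]


# Made-up DataGrid; the first value of each row acts as the row header.
sample_grid = {
    "columns": [
        {"columnType": "text", "label": None},
        {"columnType": "number", "label": "Revenue"},
    ],
    "rows": [
        {"values": ["2022", 1.5]},
        {"values": ["2023", 2.0]},
    ],
}

records = grid_to_records(sample_grid)
print(records)
# [{'column_0': '2022', 'Revenue': 1.5}, {'column_0': '2023', 'Revenue': 2.0}]
```

Because labels are free text, two columns may share the same label; a consumer that needs every column would have to key on the column index instead.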