Data footprint

Designing our bot to have an effective data footprint pays off from many perspectives. It lets us query the conversational log data much faster, it puts a number of good practices into focus, and it makes us conscious of what we store in the conversational log data, which helps from a privacy perspective.

Designed for Conversational Data

Teneo Inquire, the analytics and data part of the Teneo platform, is, like the rest of Teneo, built to perform well on conversational data. It is not designed to store large chunks of binary data or large and unique JSON structures.

Typical conversational sessions in the Teneo platform fall in the range of 50kB to 400kB.

Very large sessions, e.g. really long dialogs with many turns, can be somewhat bigger, at 400kB to 800kB.

Teneo Inquire scales well with traffic for typical conversational data, meaning it scales well with regard to the number of API calls per session.

Teneo Inquire scales less well with data outside of its purpose, including:

  • Conversational log data that includes a large amount of non-conversational data, such as big JSON payloads or large binary objects.
  • Really large conversational log sessions that fall outside the normal ranges. This is often an indicator that the bot stores large chunks of non-conversational data.

To put the impact in perspective: at, say, 100,000 sessions per day, typical 200kB sessions add roughly 20GB of log data per day, while blob-inflated 1MB sessions add roughly 100GB.

Solution data footprint is key

An effective solution data footprint is a key indicator of a well-designed bot. It also greatly impacts the performance of Teneo Inquire, which determines how fast we can query our conversation log data and how quickly Teneo Studio can give us feedback, e.g. in the Optimization section.

Good practices

Here are some good practices to follow when working on your data footprint.

Solution design

  • Avoid sending in large 'blobs' or long strings representing data objects, as these are really costly to store and query. Instead, use integrations to call web services when you need to retrieve such data, keeping only a small reference (such as an ID) in the session; see the sketch below.
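
As a sketch of this integration approach, assuming a hypothetical endpoint, variable names, and fields (in a real solution this logic would live in an integration or a flow script):

```groovy
import groovy.json.JsonSlurper

// Keep only a small identifier in the session: cheap to log and query.
def customerId = "C-12345"

// Fetch the heavy data on demand from a web service (hypothetical endpoint)
// instead of carrying the full record, or a serialized blob of it, in the
// session and ultimately in the conversational log data.
def customer = new JsonSlurper().parse(new URL("https://api.example.com/customers/${customerId}"))

// Use just the fields the flow needs; the payload itself is never stored.
def greetingName = customer.firstName
```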

Teneo Inquire

  • Use Adorners! Adorners can copy variables from event level to session level, which makes them much faster to query. You can read more about Adorners in the documentation and on the Developers pages; a sketch follows after this list.
  • Use Aggregators! Aggregators are used to aggregate data ahead of time, for example the amount of traffic towards the bot's key flows. These are incredibly fast to query against and can be used to power dashboards. You can read more about Aggregators in the documentation and on the Developers pages; a sketch follows after this list.
  • Use Sample! When your bot is successful and your datasets grow larger, TQL queries will take time to run. To design queries quickly, use the sample command to ask Teneo to run your query over a small subset of sessions and return results; a sketch follows after this list. Read more about sampling in the TQL Reference.
    • You can also use limit, but if the number of hits is small or there is a mistake in the TQL query, you might wait a long while for no results. Read more about limit in the TQL Reference and in the TQL Reference Guide on the Developers pages.
  • When you are working on reporting and analytics, it's a good idea to develop your Teneo Query Language queries in Teneo Studio, as you have much more support there. For long-running queries, however, it is recommended to use the Teneo Inquire Client.
    • Teneo Studio also gives you the possibility of sharing queries, which is a perfect way to save commonly used queries.
    • You can publish queries, which are then easy to retrieve using the Teneo Inquire Client.
  • Do not wait to set up efficient reporting: do it already in sprint 1 and extend it as you go. This makes sure things are done right from the beginning.
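
As a sketch of the Adorner pay-off on the query side: the `la <selections> : <constraints>` shape is regular TQL, but the adorned session-level string `s.a.s:customerSegment` is a hypothetical example name, and the exact adornment syntax should be taken from the Adorners documentation:

```
la s.id, s.a.s:customerSegment : s.a.s:customerSegment == "premium"
```

Because the adorned value lives at session level, Teneo Inquire does not have to scan every event of every session to answer the query.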
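
As a sketch of what an Aggregator typically materializes, here is a distribution query, assuming the `d` (distribute) command from the TQL Reference and reusing a hypothetical adorned property for the flow name:

```
d s.a.s:mainFlow
```

An aggregator that pre-computes this distribution can answer "how much traffic reaches each key flow?" for a dashboard without touching the raw session logs at query time.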
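
As sketches of sample and limit, with the exact syntax to be confirmed in the TQL Reference: the intent is that sample runs the query over a subset of sessions (here `a` is a placeholder sample specifier), while limit only caps the number of returned results:

```
sample a la s.id, t.e.userInput
```

```
la s.id, t.e.userInput limit 100
```

This is also why sample is the safer drafting tool: a sampled query comes back quickly even when there are no hits, while a limited query over a huge dataset may still scan everything before concluding that nothing matched.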

Further reading

Further reading can be found in the Forum, where you can also ask a Teneo Developer questions.
