************************ Providing Data to Slices ************************ .. include:: ../../complete.rst Now that we know how to add slices to a stack, we need to know how to provide data to those slices. There are two ways to do this, the first is to provide a json fixture file that provides the data in the proper response format. We'll talk about this more in a later section. The second method, which is more common is to build a Python module that has a class for each slice to provide the data needed for the slice. These Python modules all follow the same basic structure as shown here: # Setup # Data Tables # Recipe Service # Filter Service # Slice Services The setup section is where we import all the needed components and establish database connections. Next, we define all the tables that hold the data we will be using in our slices. With our tables defined, we can build a Recipe Service that houses all our data "ingredients". (Don't worry about all these new words, we will define and explore them all in future sections.) Finally, we build classes based off our Recipe Service that provide the data for each slice. .. _label_setup: Setup ===== In the setup section, we need to import all the data types and functions we use in our Tables and slices. You will always need ``Table`` and ``Column``. If you are going to sum or count things, you will need ``func`` as well. After that you will need to import the data types present in your table, a full list of these is available in the `SQLAlchemy Docs `_ .. code-block:: python from sqlalchemy import func, Table, Column, String, Float Next, we'll import the declarative base, which provides a container to store all the metadata about our data tables. This is used by the recipes we'll talk more about soon. .. code-block:: python from sqlalchemy.ext.declarative import declarative_base Most Juicebox stacks require a connection to Redshift, a type of database, to access their data. We've built an easier way to use a preconfigured engine to create connections. .. code-block:: python from dataservices.redshift_connectionbase import redshift_create_engine When we start creating data services to power our slices, we are going to use a recipe, which is a way of defining how we get and use data in our slice. All recipes are built from metrics and dimensions. A metric is a single unit of data that we are displaying in the slice. Dimensions are just like metrics, except that they are used to group or aggregate data in our slice. Recipes can also have filters that are used to limit the data output in our slice. More on this later, but for now we just need to import these components so we can use them in our slices. .. code-block:: python from dataservices.recipe import Metric, Dimension, Filter Recipes are part of the secret sauce that makes Juicebox awesome. The RecipeServiceBaseV3 is the default object that all data services inherit from. It provides most of the services needed by a Juicebox data services. .. code-block:: python from dataservices.servicebasev3 import RecipeServiceBaseV3 With all the items we need imported, we start by connecting to the database. This is done by creating an engine, we'll use in our BaseService. .. code-block:: python engine = redshift_create_engine() Next, we initialize a Base to use as the foundation of our data tables. This base stores important metadata about our tables, and how they function inside the database. .. code-block:: python Base = declarative_base() Data Tables =========== We begin the main part of our application by defining the data tables that will be used. The data tables are represented by classes, and those classes must inherit from the Base we created in the previous step in order to have access to all the query features available in the platform. .. code-block:: python class Census(Base): __table__ = Table('census', Base.metadata, Column('state', String(30), primary_key=True), Column('sex', String(1)), Column('age', Float()), Column('pop2000', Float()), Column('pop2008', Float()), schema='demo', extend_existing=True) Tables are defined and assigned to the ``__table__`` attribute of the class. The ``Table`` constructor takes a table name, an instance of ``Base.metadata``, a collection of ``Column`` objects, a schema name that we will provide to you, and a setting ``extend_existing=True`` to ensure that if the table is already defined that our new configuration is still applied. The name must match the table name present in Redshift. Column objects are defined by a name, type, and other attributes. You can learn more about Column object types in the `SQLAlchemy docs for types `_. At least one of the Column objects just be marked as a ``primary_key``, and is used by the metadata and other Juicebox features. Ingredients =========== Ingredients represent the data we will be using later in data services for our slices. There are three main types of ingredients: Metrics, Dimensions, and Filters. Metrics are values, which are often numeric calculations. Dimensions are typically categorical values that represent a grouping or aggregation. For example, if you wanted to know the average height by gender, "average height" would be a Metric, and "gender" would be a Dimension. It's important to note that multiple columns from a table can be used to compute a Metric or Dimension. A Filter is used to limit the results of a recipe based on some metric or dimension. Metrics and Dimensions are all defined with a reference to a column from a data table or a function operating on one or more columns. They also use keywords that define the visual formatting of the Ingredient. The ``singular`` and ``plural`` keywords specify what the singular and plural descriptions of the ingredient should be. To control the format of numbers, you can use the ``format`` keyword. If you need an expression to be output that includes the ingredient everywhere it appears, you can use the ``formatter`` keyword; however, this is often unnecessary. An example Metric: .. code-block:: python Metric(func.sum(Census.pop2008 - Census.pop2000), format=".3s", singular='Population Growth') An example Dimension: .. code-block:: python Dimension(Census.state, singular='State', plural='States', format="") Recipe Service ============== Recipe services are the core elements of a Juicebox application. They are made up of collections of Metrics and Dimensions coupled with additional configuration settings. All Juicebox 3 recipe services inherit from the ``RecipeServiceBaseV3`` we imported earlier, and define a ``metric_shelf`` and a ``dimension_shelf`` which contain the Ingredients we wish to use in our data services. Here is an example of a basic service. .. code-block:: python class BasicService(RecipeServiceBaseV3): metric_shelf = { 'pop2000': Metric(func.sum(Census.pop2000), singular='Population 2000'), 'pop2008': Metric(func.sum(Census.pop2008), singular='Population 2008'), 'popdiff': Metric(func.sum(Census.pop2008 - Census.pop2000), singular='Population Growth'), 'avgage': Metric(func.sum(Census.pop2008 * Census.age) / func.sum(Census.pop2008), singular='Average Age'), # A metric using a complex expression 'pctfemale': Metric(func.sum(case([(Census.sex == 'F', Census.pop2008)], else_=0)) / func.sum(Census.pop2008), singular='% Female'), # a metric using a formatter 'pctdiff': Metric(func.sum(Census.pop2008 - Census.pop2000) / func.sum(Census.pop2000), singular='Population Pct Change', formatters=[lambda x: "Change is {0:0.1%} " "percent".format(x)]), } dimension_shelf = { # Simplest possible dimension, a SQLAlchemy expression and a label. 'state': Dimension(Census.state, singular='State'), # This will use the lookup to get display values of "M" and "F" 'sex': LookupDimension(Census.sex, singular='Sex', lookup={'M': 'Menfolk', "F": "Womenfolk"}), } Our recipe service can also specify some values to filter on automatically. By default every slice listens to the slice above it to adjust its view of the data. They are a list of keys on which every slice should be automatically filtered if it is present in the request. It typically matches the dimensions and metrics used in the FilterService, which we will discuss in the next chapter. An example of using ``automatic_filter_keys``. .. code-block:: python automatic_filter_keys = ('state',) It is also possible to target a specific database by using the ``database`` attribute on a recipe service. This allows you to specify which database the recipes should be run against. Most commonly this is used to target the ``juicebox`` or ``healthstream`` databases. This is not required and by default targets the ``juicebox`` redshift instance. An example of specifying the ``database``. .. code-block:: python database = 'juicebox' We're all set for building our global filters now.