============ Distribution ============ .. include:: ../../incomplete.rst .. contents:: A Distribution slice is used to present the user with a view of the data separated into buckets and then into individual results within that grouping. It currently only has a default flavor. Distribution config ================================ Distribution slices support the :doc:`common_configuration`. Additional options are: countField -------------------------- Field name in data items that could be used for total count in each group/bin. :Optional: Yes, by default the count total in each bin is the number of cells :Values: CSS selector :Example: .. code-block:: python config: countField: "count" scaleCellSize -------------------------- Normally, cells in distribution are fixed-sized. However, sometimes cell value needs to represent the entire group/bin value and needs to scale according to its value. This option is useful when bins have values that are not feasible to draw as cells, and by setting the `scaleCellSize` to true, a single cell could scale accordingly and represent the entire bin value (so bins could be compared with each other) :Optional: Yes, the default is false (cells are fixed-sized) :Values: true|false :Example: .. code-block:: python config: scaleCellSize: true cellTemplateName -------------------------- The name (CSS Selector) of an HTML template used to render distribution data items/cells. Depending on the cell size, the template content will automatically be assigned one of the class names: ``.content-minimum``, ``.content-medium`` or ``.content-maximum``. Please refer to ``_distribution .scss`` file to see what each of these classes does to the template content. :Optional: Yes, there is a default template in distribution plugin file :Values: CSS selector :Example: .. code-block:: python config: cellTemplateName: #my-template cellSizeRange -------------------------- An array of maximum two items. Each item has [width, height] and will be used to determine one of the minimum/medium/maximum visibility modes for a cell. If the rendered cell is smaller than the first [width, height] in cellSizes, it will be assigned ``content-minimum`` class, if it is bigger than the first [width, height], but smaller the second, it will be assigned ``content-medium`` class, otherwise it will just be assigned ``content-maximum``. See class definitions in ``_distribution.scss``. If you want to have just two visibility modes (minimum/maximum), set ``cellSizeRange`` to a single item, like [ [width, height] ], if you want to have three visibility modes, set ``cellSizeRange`` to [ [width1, height1], [width2, height2] ]. No more than two items in ``cellSizeRange`` are allowed :Optional: Yes, default is [ [70, 20] ] (subject to change) :Values: Array :Example: .. code-block:: python config: cellSizeRange: [ [100, 20] ] groupWidthRange -------------------------- An array that represents the group width range (in px) :Optional: Yes, default is [100, 300] :Values: Array of numbers :Example: .. code-block:: python config: groupWidthRange: [200,200] colors ----------------- Defines the background color of the cells. **colors** is a key-value pair, where the *key* is the name of a dataset and *value* is an object that describes the color for that dataset. User can define a **default** dataset, which will be used when the dataset name is not found among **colors** keys. A typical **colors** looks like this: .. code-block:: python config: colors: default: range: ["#f00", "#00f"] domain: [0, 100], field: "score" dataset1: range: ["#f00", "#00f"] field: "value" :Optional: yes :Values: a nested object in form "{datasetName: {domain: [], range: [], field: "fieldName"}}" **range**: the color range (usually min/max hex values) **domain** (optional): the range of values (typically the min/max data values) that are mapped to the range colors **field**: the property name in the data items that should be used as a value to color (that will be mapped to the color) :Example: .. code-block:: python config: colors: default: range: ["#f00", "#00f"] domain: [0, 100], field: "score" dataset1: range: ["#f00", "#00f"] field: "value" Flavors of Distribution ======================= .. warning:: Distribution doesn't perform well when there are large numbers of items. Use the ``bars`` flavor of distribution when the number of items is large. Default flavor (distribution) ----------------------------- The default flavor renders values grouped into buckets on the distinct elements of a grouping dimension. Within those buckets distinct items of another dimension (the grain dimension) are displayed. The value of a single metric for that dimension (value) . .. image:: images/distribution-defaultflavor.png An example with the default flavor looks like this: .. code-block:: python class DistributionDefaultFlavorService(CensusService): """ Default flavor requires two dimensions. The first dimension is the group_dimension that creates groups. The second dimension is the grain_dimension which defines the items that appear in the groups. """ metric_shelf = { 'pop2000': Metric(func.sum(Census.pop2000), label='Population 2000', format=".3s", singular="Population 2000", plural="Population 2000"), } # Dimensions are ways to split the data. dimension_shelf = { 'age': Dimension(Census.age, singular='Age', plural='Ages', format=".2f"), 'age_bands': Dimension(case([(Census.age < 21, 'Under 21'), (Census.age < 49, '21-49') ], else_='Other'), label='Age Bands'), } def build_response(self): self.metrics = ('pop2000',) self.dimensions = ('age_bands', 'age') recipe = self.recipe().dimensions(*self.dimensions) \ .metrics(*self.metrics).order_by(*self.dimensions) self.response['responses'].append(recipe.render()) The slice in stack.yaml: .. code-block:: yaml - slice_type: "distribution" slug: "distribution_defaultflavor" title: "Default flavor for distribution uses one dimension for groups and one for items" config: "cellTemplateName": "#distribution-template" data_service: "detailservice.DistributionDefaultFlavorService" The cellTemplateName is a template that controls how individual items are displayed. In this case, it is .. code-block:: html The cell template has to display results in a fixed height that it does not control. Default flavor with ordered buckets ----------------------------------- The default flavor can control ordering by providing a list of bucket labels. The show_all option will display buckets even if there are no items in them. Here's an example. .. image:: images/distribution-defaultflavorwithorder.png .. code-block:: python class DistributionDefaultFlavorWithOrderService(CensusService): """ The default flavor can control ordering by providing a list of bucket labels. The show_all option will display buckets even if there are no items in them. """ metric_shelf = { 'pop2000': Metric(func.sum(Census.pop2000), label='Population 2000', format=".3s", singular="Population 2000", plural="Population 2000"), } # Dimensions are ways to split the data. dimension_shelf = { 'age': Dimension(Census.age, singular='Age', plural='Ages', format=".2f"), 'age_bands': Dimension(case([(Census.age < 21, 'Under 21'), (Census.age < 49, '21-49') ], else_='Other'), label='Age Bands'), } def build_response(self): self.metrics = ('pop2000',) self.dimensions = ('age_bands', 'age') recipe = self.recipe().dimensions(*self.dimensions) \ .metrics(*self.metrics).order_by(*self.dimensions) self.response['responses'].append(recipe.render(render_config={ 'order': ['Under 21', '21-49', 'Other', 'Misc'], 'show_all': True })) The stack config is the same as previous. .. code-block:: yaml - slice_type: "distribution" slug: "distribution_withorder" title: "Distributions can be put in a specific order and can show groups with no data" config: "cellTemplateName": "#distribution-template" data_service: "detailservice.DistributionDefaultFlavorWithOrderService" Default flavor with custom groupings ------------------------------------ Sometimes you can't create a Dimension to group the buckets. In these cases you'll need to define the groupings with a python function. The default flavor provides a ``grouper`` option to generate the groups. .. image:: images/distribution-defaultflavorgrouper.png .. code-block:: python class DistributionCustomGroupingService(CensusService): """ The default flavor can provide a custom grouping function to make the groups. """ metric_shelf = { 'pop2000': Metric(func.sum(Census.pop2000), label='Population 2000', format=".3s", singular="Population 2000", plural="Population 2000"), } dimension_shelf = { 'state': Dimension(Census.state, singular='State', plural='States', format=".f"), } def build_response(self): """ You can define groups using a function """ def regional_grouping(row): if row.state in ( "Connecticut", "Maine", "Massachusetts", "New Hampshire", "Rhode Island", "Vermont", "New Jersey", "New York", "Pennsylvania"): return "Northeast" elif row.state in ( "Illinois", "Indiana", "Michigan", "Ohio", "Wisconsin", "Iowa", "Kansas", "Minnesota", "Missouri", "Nebraska", "North Dakota", "South Dakota"): return "Midwest" elif row.state in ( "Delaware", "District of Columbia", "Florida", "Georgia", "Maryland", "North Carolina", "South Carolina", "Virginia", "West Virginia", "Alabama", "Kentucky", "Mississippi", "Tennessee", "Arkansas", "Louisiana", "Oklahoma", "Texas"): return "South" elif row.state in ( "Arizona", "Colorado", "Idaho", "Montana", "Nevada", "New Mexico", "Utah", "Wyoming", "Alaska", "California", "Hawaii", "Oregon", "Washington"): return "West" else: return "Other" self.metrics = ('pop2000',) self.dimensions = ('state',) recipe = self.recipe().dimensions(*self.dimensions) \ .metrics(*self.metrics).order_by(*self.dimensions) response = recipe.render( name="States", render_config={'grouper': regional_grouping} ) self.response['responses'].append(response) The stack config uses a different cellTemplateName. .. code-block:: yaml - slice_type: "distribution" slug: "distribution_customgroup" title: "Distributions can define custom groupings" config: "cellTemplateName": "#state-template" data_service: "detailservice.DistributionCustomGroupingService" The new template is. .. code-block:: html Default flavor with colored cells --------------------------------- Cells can be colored with the ``colors`` config option. Here's an example that builds off the previous example. .. image:: images/distribution-defaultflavorcolored.png This is how we rendered the previous recipe. The ``name`` from the ``recipe.render`` appears in the config. .. code-block:: python response = recipe.render( name="States", render_config={'grouper': regional_grouping} ) The key ``"States"`` in the colors config is the ``name`` of the response that should be colored. .. code-block:: yaml - slice_type: "distribution" slug: "distribution_colored" title: "Cells can be colored with the colors config option" config: "cellTemplateName": "#state-template" "colors": "States": "range": ["#f60", "#096"] "field": "value" data_service: "detailservice.DistributionCustomGroupingService" Changing the summary value for groups ------------------------------------- The default summary value for groups is the number of items in that group. You can change it to the sum of the value for items in that group by supplying ``countField`` in the config. .. image:: images/distribution-countfield.png .. code-block:: python class DistributionCustomGroupingService(CensusService): """ The default flavor can provide a custom grouping function to make the groups. """ metric_shelf = { 'pop2000': Metric(func.sum(Census.pop2000), format=".3s", singular="Total population in 2000", plural="Total population in 2000"), } dimension_shelf = { 'state': Dimension(Census.state, singular='State', plural='States', format=".f"), } def build_response(self): """ You can define groups using a function """ def regional_grouping(row): if row.state in ( "Connecticut", "Maine", "Massachusetts", "New Hampshire", "Rhode Island", "Vermont", "New Jersey", "New York", "Pennsylvania"): return "Northeast" elif row.state in ( "Illinois", "Indiana", "Michigan", "Ohio", "Wisconsin", "Iowa", "Kansas", "Minnesota", "Missouri", "Nebraska", "North Dakota", "South Dakota"): return "Midwest" elif row.state in ( "Delaware", "District of Columbia", "Florida", "Georgia", "Maryland", "North Carolina", "South Carolina", "Virginia", "West Virginia", "Alabama", "Kentucky", "Mississippi", "Tennessee", "Arkansas", "Louisiana", "Oklahoma", "Texas"): return "South" elif row.state in ( "Arizona", "Colorado", "Idaho", "Montana", "Nevada", "New Mexico", "Utah", "Wyoming", "Alaska", "California", "Hawaii", "Oregon", "Washington"): return "West" else: return "Other" self.metrics = ('pop2000',) self.dimensions = ('state',) recipe = self.recipe().dimensions(*self.dimensions) \ .metrics(*self.metrics).order_by(*self.dimensions) response = recipe.render( name="States", render_config={'grouper': regional_grouping} ) self.response['responses'].append(response) The stack config is the following. .. code-block:: yaml - slice_type: "distribution" slug: "distribution_value" title: "The metric at the top of the group can sum the values of items" config: "cellTemplateName": "#state-template" "countField": "value" data_service: "detailservice.DistributionCustomGroupingService" You can override the format for the summary value. .. code-block:: python response = recipe.render( name="States", render_config={'grouper': regional_grouping} ) # Override the metadata.{response_name}.format # with a juicebox format. response['metadata']['States']['format'] = '"Total: ",.0f' self.response['responses'].append(response) .. image:: images/distribution-countfield-customformat.png Bars flavor: showing bars rather than items ------------------------------------------- You can show the value of the summary metric in each group rather than individual items using the ``bars`` flavor. You may want to dynamically switch between using ``bars`` and the ``default`` flavor by counting the number of items that you have before creating a response. The ``bars`` flavor supports the ``order`` and ``show_all`` render_config options just like the default flavor. .. image:: images/distribution-barsflavor.png .. code-block:: python class DistributionBarsFlavor(CensusService): """ Distribution can be used to show bars for the value metric within each group. """ metric_shelf = { 'pop2000': Metric(func.sum(Census.pop2000), label='Population 2000', format=".3s", singular="Population 2000", plural="Population 2000"), } # Dimensions are ways to split the data. dimension_shelf = { 'age': Dimension(Census.age, singular='Age', plural='Ages', format=".2f"), 'age_bands': Dimension(case([(Census.age < 21, 'Under 21'), (Census.age < 49, '21-49') ], else_='Other'), label='Age Bands'), } def build_response(self): self.metrics = ('pop2000',) self.dimensions = ('age_bands',) recipe = self.recipe().dimensions(*self.dimensions) \ .metrics(*self.metrics).order_by(*self.dimensions) self.response['responses'].append(recipe.render(flavor='bars')) The stack config is the following. .. code-block:: yaml - slice_type: "distribution" slug: "distribution_bars" title: "Distributions can show bars for the number of items" config: {} data_service: "detailservice.DistributionBarsFlavor"