
Dash/plotly, show only top 10 values in histogram

I am creating a dashboard in Dash for a university course. I created three histograms, but each has many unique values, which makes for a very long x-axis. In my plots I would like to show only the 10 or 20 values with the highest counts (the top 10 values). Can someone help me out?

import plotly.express as px
from jupyter_dash import JupyterDash
import dash_core_components as dcc
import dash_html_components as html
from dash.dependencies import Input, Output

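# Build App
# `posts` is assumed to be a pandas DataFrame loaded earlier, with the columns
# 'marketextract', 'datetime', 'username' and 'drugs'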
app = JupyterDash(__name__)
app.layout = html.Div([
   html.H1("forensics "),
   dcc.Graph(id='graph'),
   dcc.Graph(id='graph1'),
   dcc.Graph(id='graph2'),
   html.Label([
        "select market",
        dcc.Dropdown(
            id='market', clearable=False,
            value='whitehousemarket', options=[
                {'label': c, 'value': c}
                for c in posts['marketextract'].unique()
            ])
    ]),
])
# Define callback to update graph
@app.callback(
    Output('graph', 'figure'),
    Output('graph1', 'figure'),
    Output('graph2', 'figure'),
    [Input("market", "value")]
)
def update_figure(market):
    fig=px.histogram(x=posts['datetime'].loc[posts['marketextract']==market])
    fig1=px.histogram(x=posts['username'].loc[posts['marketextract']==market])
    fig2=px.histogram(x=posts['drugs'].loc[posts['marketextract']==market])
    return [fig, fig1, fig2]



# Run app and display result inline in the notebook
app.run_server(mode='inline')

Antoine Van Esch asked Apr 11 '21 13:04


1 Answer

To my knowledge, px.histogram() has no option to exclude certain observations or bins. But judging by the look of your data (please consider sharing a proper sample), what you're doing here is just showing the counts of the different user names. You can easily do that through a combination of df.groupby() and px.histogram(). px.bar() or go.Bar() would work too, but we'll stick with px.histogram() since that is what you're asking about. Using random selections of country names from px.data.gapminder(), you can use:

dfg = df.groupby(['name']).size().to_frame().sort_values([0], ascending = False).head(10).reset_index()
dfg.columns = ['name', 'count']  # after reset_index() the count column is still named 0
fig = px.histogram(dfg, x='name', y = 'count')

And get:

[Figure: histogram of the 10 most frequent names]

If you drop .head(10), you'll get this instead:

[Figure: histogram of all names, sorted by count]
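The effect of .head(10) on the grouped table can also be checked without plotting anything. The following is a minimal sketch on hand-made data: the user names are invented for illustration, and reset_index(name='count') is used here as a shorthand for to_frame() followed by renaming the columns.

```python
import pandas as pd

# Hypothetical sample: 12 distinct names, where "user<i>" occurs i + 1 times
names = [f"user{i}" for i in range(12) for _ in range(i + 1)]
df = pd.DataFrame({"name": names})

# Count rows per name, label the count column, sort by count (descending)
counts = (
    df.groupby("name")
      .size()
      .reset_index(name="count")
      .sort_values("count", ascending=False)
)

# Keep only the ten most frequent names
top10 = counts.head(10)

print(len(counts))  # 12 rows: one per distinct name
print(len(top10))   # 10 rows: the two rarest names are gone
```

Passing counts to px.histogram() plots all twelve names, while passing top10 plots only the ten most frequent ones.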

I hope this is the sort of functionality you were looking for. Don't be intimidated by the long df.groupby(['name']).size().to_frame().sort_values([0], ascending = False).reset_index() chain: it counts the rows per name, turns the result into a data frame, sorts it by count in descending order, and moves the names back into a regular column. I'm not a pandas expert, so you could quite possibly find a more efficient approach, but it does the job. Here's the complete code with some sample data:

# imports
import pandas as pd
import plotly.express as px
import random

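# data sample: 19 country names, drawn 100 times with replacement
gapminder = list(set(px.data.gapminder()['country']))[1:20]
names = random.choices(gapminder, k=100)

# data munging: count rows per name, sort by count, keep the 10 most frequent
# (drop .head(10) to plot every name)
df = pd.DataFrame({'name':names})
dfg = df.groupby(['name']).size().to_frame().sort_values([0], ascending = False).head(10).reset_index()
dfg.columns = ['name', 'count']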

# plotly
fig = px.histogram(dfg, x='name', y = 'count')
# px.histogram titles the y-axis 'sum of count' by default; override it
fig.layout.yaxis.title.text = 'count'
fig.show()
vestland answered Oct 15 '22 11:10