I am creating a dashboard in Dash for a university course. I created three histograms; however, the data contain many unique values, which produces a very long range of x values. In my plots I would like to show only the 10 or 20 values with the highest counts (for example, the top 10). Can someone help me out?
import plotly.express as px
from jupyter_dash import JupyterDash
import dash_core_components as dcc
import dash_html_components as html
from dash.dependencies import Input, Output

# Build App
app = JupyterDash(__name__)
app.layout = html.Div([
    html.H1("forensics "),
    dcc.Graph(id='graph'),
    dcc.Graph(id='graph1'),
    dcc.Graph(id='graph2'),
    html.Label([
        "select market",
        dcc.Dropdown(
            id='market', clearable=False,
            value='whitehousemarket', options=[
                {'label': c, 'value': c}
                for c in posts['marketextract'].unique()
            ])
    ]),
])

# Define callback to update graph
@app.callback(
    Output('graph', 'figure'),
    Output('graph1', 'figure'),
    Output('graph2', 'figure'),
    [Input("market", "value")]
)
def update_figure(market):
    fig = px.histogram(x=posts['datetime'].loc[posts['marketextract'] == market])
    fig1 = px.histogram(x=posts['username'].loc[posts['marketextract'] == market])
    fig2 = px.histogram(x=posts['drugs'].loc[posts['marketextract'] == market])
    return [fig, fig1, fig2]

# Run app and display result inline in the notebook
app.run_server(mode='inline')
To my knowledge, px.histogram() does not have a way to exclude certain observations or bins. But judging by the look of your data (please consider sharing a proper sample), what you're doing here is just showing the counts of the different user names. You can easily do that through a combination of df.groupby() and px.histogram(), or px.bar() or go.Bar() for that matter, but we'll stick with px.histogram() since that is what you're seeking help with. Using random selections of country names from px.data.gapminder(), you can use:
dfg = df.groupby(['name']).size().to_frame().sort_values([0], ascending=False).head(10).reset_index()
dfg.columns = ['name', 'count']  # the count column is named 0 until it is renamed
fig = px.histogram(dfg, x='name', y='count')
This gives a chart with only the ten most frequent names. If you drop .head(10), you'll get every name instead.
I hope this is the sort of functionality you were looking for. Don't be intimidated by the long df.groupby(['name']).size().to_frame().sort_values([0], ascending=False).reset_index() chain: it counts the rows per name, turns the counts into a DataFrame, sorts them in descending order, and restores name as a regular column. I'm not a pandas expert, so you could quite possibly find a more efficient approach, but it does the job. Here's the complete code with some sample data:
# imports
import pandas as pd
import plotly.express as px
import random
# data sample
gapminder = list(set(px.data.gapminder()['country']))[1:20]
names = random.choices(gapminder, k=100)
# data munging
df = pd.DataFrame({'name':names})
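# count rows per name, sorted descending; add .head(10) before .reset_index() to keep only the top 10
dfg = df.groupby(['name']).size().to_frame().sort_values([0], ascending = False).reset_index()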
dfg.columns = ['name', 'count']
# plotly
fig = px.histogram(dfg, x='name', y = 'count')
fig.layout.yaxis.title.text = 'count'
fig.show()
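As a follow-up to the "more efficient approach" remark, here is a minimal sketch of the same top-N table built with Series.value_counts(), which already returns counts sorted in descending order. The helper name top_counts and the tiny posts frame with its values are made up for illustration; only the column names marketextract and username come from the question. Plotly is left out so the snippet runs with pandas alone.

```python
import pandas as pd


def top_counts(values: pd.Series, n: int = 10) -> pd.DataFrame:
    """Return the n most frequent values and their counts (hypothetical helper)."""
    counts = values.value_counts().head(n)  # value_counts sorts by count, descending
    # Build the frame explicitly so the column names do not depend on the pandas version
    return pd.DataFrame({"name": counts.index, "count": counts.to_numpy()})


# Made-up stand-in for the asker's `posts` DataFrame
posts = pd.DataFrame({
    "marketextract": ["a", "a", "a", "a", "b", "b"],
    "username": ["ann", "bob", "ann", "cat", "ann", "dan"],
})

# Same filter as in the question's callback, then keep the two most frequent usernames
market = "a"
top_users = top_counts(posts.loc[posts["marketextract"] == market, "username"], n=2)
print(top_users)
```

Inside the callback, a table like top_users could then be passed to px.bar(top_users, x='name', y='count') instead of calling px.histogram on the raw column, so each figure only ever receives the rows that should be drawn.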