Simulate
Simulation
- class pysimmmulator.simulate.Multisim
Provides capability to generate multiple runs on a single configuration
- property get_data
Provides the iterable generator of final simulation dataframes and ground-truth channel ROI values
- Parameters:
None
- Returns:
iterable of final sim dataframes and channel ROI values
- Return type:
data (iterable)
- run(config: dict, runs: int) None
- stash_outputs(final_df: DataFrame, channel_roi: dict)
Stores the final simulation dataframe as well as the ground truth channel ROI values for each run of the multiple simulations.
- class pysimmmulator.simulate.Simulate(basic_params: BasicParameters = None, random_seed=None)
Takes basic params as input and provides either piecemeal or single-shot creation of MMM data using a config file
- calculate_channel_roi(mmm_df: DataFrame) dict
Calculates the ROI for all channels, based on pre-generated spend and conversions data
- Parameters:
mmm_df (pd.DataFrame) – Consolidated MMM DataFrame
- Returns:
Channel ROI mapping
- Return type:
dict
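As a rough illustration of a channel ROI mapping, the sketch below computes ROI per channel from total spend and conversions. It is not the library's implementation: the function name `channel_roi_sketch`, the `revenue_per_conv` argument, the input dictionaries, and the formula (revenue - spend) / spend are all assumptions made for the example.

```python
def channel_roi_sketch(spend, conversions, revenue_per_conv):
    """Hypothetical ROI per channel: (revenue - spend) / spend."""
    roi = {}
    for channel, channel_spend in spend.items():
        # Assumption: every conversion is worth the same fixed revenue.
        revenue = conversions[channel] * revenue_per_conv
        # Channels with no spend get None rather than a division by zero.
        roi[channel] = (
            (revenue - channel_spend) / channel_spend if channel_spend else None
        )
    return roi


roi = channel_roi_sketch(
    spend={"search": 1000.0, "social": 500.0},
    conversions={"search": 60, "social": 20},
    revenue_per_conv=25.0,
)
```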
- calculate_conversions(mmm_df: DataFrame) DataFrame
Calculates row-wise conversion values from the noisy CVR and the adstocked media metric associated with each channel.
- Parameters:
mmm_df (pd.DataFrame) – MMM input DataFrame
- Returns:
Updated mmm_df
- Return type:
pd.DataFrame
- consolidate_dataframe(mmm_df: DataFrame, baseline_sales_df: DataFrame) DataFrame
Filters and formats internal data into uniform output.
- Parameters:
mmm_df (pd.DataFrame) – MMM input DataFrame
baseline_sales_df (pd.DataFrame) – Baseline sales DataFrame
- Returns:
Consolidated MMM DataFrame
- Return type:
pd.DataFrame
- finalize_output(mmm_df: DataFrame, aggregation_level: str) DataFrame
Provides aggregation (daily or weekly) and column filtering for the final output
- Parameters:
mmm_df (pd.DataFrame) – Consolidated MMM DataFrame
aggregation_level (str) – [daily, weekly] the granularity at which to return output data
- Returns:
Finalized output DataFrame
- Return type:
pd.DataFrame
- run_with_config(config: dict) tuple[DataFrame, dict]
- simulate_ad_spend(baseline_sales_df: DataFrame, campaign_spend_mean: int, campaign_spend_std: int, max_min_proportion_on_each_channel: dict) DataFrame
Simulation of ad spend based on normal distribution parameters for campaign spend. Overall campaign spend is then divided amongst each channel based on passed min-max proportionality.
- Parameters:
baseline_sales_df (pd.DataFrame) – DataFrame containing baseline sales
campaign_spend_mean (int) – The average amount of money spent on a campaign.
campaign_spend_std (int) – The standard deviation of money spent on a campaign
max_min_proportion_on_each_channel (dict) – Specifies the minimum and maximum percentages of total spend allocated to each channel.
- Returns:
DataFrame containing ad spend data
- Return type:
pd.DataFrame
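The spend split described for simulate_ad_spend can be sketched for a single campaign in plain Python. This is only one plausible reading: the function name `split_campaign_spend`, the `{"min": ..., "max": ...}` dictionary shape, the uniform draw within each channel's bounds, and the renormalisation step are assumptions, and the library may allocate spend differently.

```python
import random


def split_campaign_spend(spend_mean, spend_std, proportions, seed=None):
    """Hypothetical split of one campaign's spend across channels."""
    rng = random.Random(seed)
    # Total campaign spend is normally distributed; negative draws are clamped.
    total = max(rng.gauss(spend_mean, spend_std), 0.0)
    # Assumption: each channel's share is drawn uniformly within its bounds.
    raw = {ch: rng.uniform(b["min"], b["max"]) for ch, b in proportions.items()}
    # Assumption: shares are rescaled so that they sum to exactly 1.
    scale = sum(raw.values())
    return {ch: total * share / scale for ch, share in raw.items()}


channel_spend = split_campaign_spend(
    spend_mean=100_000,
    spend_std=5_000,
    proportions={
        "search": {"min": 0.4, "max": 0.6},
        "social": {"min": 0.3, "max": 0.5},
    },
    seed=7,
)
```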
- simulate_baseline(base_p: int, trend_p: int, temp_var: int, temp_coef_mean: int, temp_coef_sd: int, error_std: int) DataFrame
Simulation of baseline sales and revenue for the subject business.
The simulation calculates daily baseline sales as a sum of:
- Base sales: A constant value (base_p)
- Trend: Linear growth over the period (total growth of trend_p)
- Seasonality: Modeled via a sine function (height temp_var) scaled by a random importance coefficient (mean temp_coef_mean, std temp_coef_sd)
- Error: Gaussian noise (std error_std)
If the combined terms result in negative sales, they are clamped to zero.
- Parameters:
base_p (int) – Daily base sales units (non-marketing driven).
trend_p (int) – Total linear growth units over the full duration.
temp_var (int) – Amplitude of the seasonal sine function.
temp_coef_mean (int) – Mean scaling factor for seasonal impact.
temp_coef_sd (int) – Standard deviation of seasonal impact scaling.
error_std (int) – Standard deviation of daily statistical noise.
- Returns:
Daily baseline sales components.
- Return type:
pd.DataFrame
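The sum of components described for simulate_baseline can be sketched in plain Python. This is illustrative rather than the library's code: the function name `baseline_sketch`, the `n_days` argument, the 365-day sine period, and drawing a single seasonal coefficient per simulation are assumptions.

```python
import math
import random


def baseline_sketch(n_days, base_p, trend_p, temp_var, temp_coef_mean,
                    temp_coef_sd, error_std, seed=None):
    """Hypothetical re-creation of the baseline sum; not the library's code."""
    rng = random.Random(seed)
    # Assumption: one seasonal importance coefficient is drawn per simulation.
    temp_coef = rng.gauss(temp_coef_mean, temp_coef_sd)
    sales = []
    for day in range(n_days):
        # Linear trend that reaches trend_p on the final day.
        trend = trend_p * day / max(n_days - 1, 1)
        # Assumption: the sine wave completes one cycle every 365 days.
        season = temp_coef * temp_var * math.sin(2 * math.pi * day / 365)
        noise = rng.gauss(0, error_std)
        # Negative totals are clamped to zero.
        sales.append(max(base_p + trend + season + noise, 0.0))
    return sales


daily_sales = baseline_sketch(n_days=730, base_p=1000, trend_p=200, temp_var=50,
                              temp_coef_mean=1, temp_coef_sd=0.2, error_std=30,
                              seed=42)
```

With temp_var and error_std set to zero, the output reduces to the base value plus the linear trend, which is a quick way to sanity-check the other terms.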
- simulate_cvr(spend_df: DataFrame, noisy_cvr: dict) DataFrame
Generates noisy conversion rates (CVR) from the true conversion rates passed in the basic params, using the noise parameters passed to this function.
- Parameters:
spend_df (pd.DataFrame) – DataFrame containing ad spend data
noisy_cvr (dict) – Specifies the bias and scale of noise added to the true value CVR for each channel.
- Returns:
Updated spend DataFrame
- Return type:
pd.DataFrame
- simulate_decay_returns(spend_df: DataFrame, adstock: dict, saturation: dict) DataFrame
Generates the decay and returns values associated with ad stocking and diminishing returns.
- Parameters:
spend_df (pd.DataFrame) – DataFrame containing ad spend data
adstock (dict) – Nested dictionary for adstock configuration
saturation (dict) – Nested dictionary for saturation configuration
- Returns:
MMM input DataFrame with decay and returns applied
- Return type:
pd.DataFrame
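For readers unfamiliar with the two effects named in simulate_decay_returns, the sketch below shows one common formulation of each: geometric adstock for carry-over and a Hill-type curve for diminishing returns. These are swapped-in textbook forms chosen for illustration; the function names and parameters are hypothetical, and the source does not state which functional forms or configuration keys the library uses.

```python
def geometric_adstock(values, decay):
    """Carry a fraction `decay` of the previous period's effect forward."""
    carried = 0.0
    out = []
    for v in values:
        carried = v + decay * carried
        out.append(carried)
    return out


def hill_saturation(value, half_saturation, shape=1.0):
    """Diminishing-returns response in [0, 1); equals 0.5 at half_saturation."""
    if value <= 0:
        return 0.0
    return value ** shape / (value ** shape + half_saturation ** shape)


adstocked = geometric_adstock([100, 0, 0], decay=0.5)
response = [hill_saturation(v, half_saturation=50) for v in adstocked]
```

A single burst of 100 units keeps contributing in later periods after adstocking, while the saturation curve compresses large values so that extra exposure yields progressively less response.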
- simulate_geos(mmm_df: DataFrame, geo_params: dict) DataFrame
Distributes the consolidated MMM dataframe into geographies.
- Parameters:
mmm_df (pd.DataFrame) – Consolidated MMM DataFrame
geo_params (dict) – Parameters for geographic distribution
- Returns:
MMM DataFrame with geographic distribution
- Return type:
pd.DataFrame
- simulate_media(spend_df: DataFrame, true_cpm: dict, true_cpc: dict, noisy_cpm_cpc: dict) DataFrame
Simulation of relevant media metrics for each channel. True values are passed and noise is applied in accordance with a normal distribution described within the noisy dict. Media metrics are checked for 0 values stemming from the random noise applied and will be flagged with logger when found. It is generally understood that negative values should not arise for media metrics.
- Parameters:
spend_df (pd.DataFrame) – DataFrame containing ad spend data
true_cpm (dict) – Specifies the true cost per thousand impressions (CPM) of each channel (noise will be added to this to simulate number of impressions)
true_cpc (dict) – Specifies the true Cost per Click (CPC) of each channel (noise will be added to this to simulate number of clicks)
noisy_cpm_cpc (dict) – Specifies the bias and scale of noise added to the true value CPM or CPC for each channel.
- Returns:
Updated spend DataFrame
- Return type:
pd.DataFrame
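The relationship between spend, noisy costs, and media metrics in simulate_media can be sketched for one channel as follows. The function name `media_metrics_sketch`, the `"bias"` and `"scale"` keys, the additive noise, and the fallback to zero for non-positive noisy costs are assumptions for illustration; CPM is treated here as cost per thousand impressions.

```python
import random


def media_metrics_sketch(spend, true_cpm, true_cpc, noise, seed=None):
    """Hypothetical impressions and clicks for one channel's spend."""
    rng = random.Random(seed)
    # Assumption: normally distributed noise is added to the true costs.
    noisy_cpm = true_cpm + rng.gauss(noise["bias"], noise["scale"])
    noisy_cpc = true_cpc + rng.gauss(noise["bias"], noise["scale"])
    # A non-positive noisy cost would give nonsensical metrics, so report zero.
    impressions = spend / noisy_cpm * 1000 if noisy_cpm > 0 else 0.0
    clicks = spend / noisy_cpc if noisy_cpc > 0 else 0.0
    return {"impressions": impressions, "clicks": clicks}


metrics = media_metrics_sketch(
    spend=1000.0, true_cpm=5.0, true_cpc=0.5,
    noise={"bias": 0.0, "scale": 0.05}, seed=11,
)
```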
Visualization
- class pysimmmulator.visualize.Visualize
- plot_clicks(df: DataFrame, agg: str = None)
Plot simulated clicks data based on a passed date-wise aggregation
- Parameters:
df (pd.DataFrame) – DataFrame containing simulated data
agg (str) – pick from [‘daily’, ‘weekly’, ‘monthly’, ‘yearly’] to aggregate simulated data by
- plot_impressions(df: DataFrame, agg: str = None)
Plot simulated impressions data based on a passed date-wise aggregation
- Parameters:
df (pd.DataFrame) – DataFrame containing simulated data
agg (str) – pick from [‘daily’, ‘weekly’, ‘monthly’, ‘yearly’] to aggregate simulated data by
- plot_revenue(df: DataFrame, agg: str = None)
Plot simulated revenue data based on a passed date-wise aggregation
- Parameters:
df (pd.DataFrame) – DataFrame containing simulated data
agg (str) – pick from [‘daily’, ‘weekly’, ‘monthly’, ‘yearly’] to aggregate simulated data by
- plot_spend(df: DataFrame, agg: str = None)
Plot simulated spend data based on a passed date-wise aggregation
- Parameters:
df (pd.DataFrame) – DataFrame containing simulated data
agg (str) – pick from [‘daily’, ‘weekly’, ‘monthly’, ‘yearly’] to aggregate simulated data by