Simulate

Simulation

class pysimmmulator.simulate.Multisim

Provides capability to generate multiple runs on a single configuration

property get_data

Provides an iterable generator over the final simulation dataframes and the ground truth channel ROI values

Parameters:

None

Returns:

iterable of final sim dataframes and channel ROI values

Return type:

data (iterable)

run(config: dict, runs: int) None
stash_outputs(final_df: DataFrame, channel_roi: dict)

Stores the final simulation dataframe as well as the ground truth channel ROI values for each run of the multiple simulations.

class pysimmmulator.simulate.Simulate(basic_params: BasicParameters = None, random_seed=None)

Takes basic params as input and provides either piecemeal or single-shot creation of MMM data using a config file

calculate_channel_roi(mmm_df: DataFrame) dict

Calculates the ROI for all channels, based on pre-generated spend and conversions data

Parameters:

mmm_df (pd.DataFrame) – Consolidated MMM DataFrame

Returns:

Channel ROI mapping

Return type:

dict
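As an illustration of the kind of mapping returned, here is a minimal sketch in plain Python. It is not the library's implementation: the function name `channel_roi_sketch`, the input dictionaries, and the ROI definition (attributed revenue minus spend, divided by spend) are all assumptions made for the example.

```python
def channel_roi_sketch(spend, revenue):
    """Hypothetical per-channel ROI: (attributed revenue - spend) / spend."""
    roi = {}
    for channel, total_spend in spend.items():
        if total_spend == 0:
            roi[channel] = None  # ROI is undefined when nothing was spent
        else:
            roi[channel] = (revenue[channel] - total_spend) / total_spend
    return roi


# Made-up totals for two channels, for illustration only.
roi = channel_roi_sketch(
    spend={"search": 1000.0, "social": 500.0},
    revenue={"search": 1500.0, "social": 400.0},
)
```

Under this definition a channel that earns back less than it spent gets a negative ROI, which is why the sketch keeps the sign rather than clamping at zero.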

calculate_conversions(mmm_df: DataFrame) DataFrame

Calculates row-wise values for conversions based on the noisy CVR and the adstocked media metric associated with each channel.

Parameters:

mmm_df (pd.DataFrame) – MMM input DataFrame

Returns:

Updated mmm_df

Return type:

pd.DataFrame
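The row-wise calculation can be pictured with a small stand-in that uses plain dictionaries instead of a DataFrame. The column names (`<channel>_clicks_adstocked`, `<channel>_noisy_cvr`, `<channel>_conversions`) and the function name are hypothetical; the library's actual column naming is not documented here.

```python
def conversions_sketch(rows, channels):
    """Hypothetical: conversions = adstocked media metric * noisy CVR, per row."""
    for row in rows:
        for ch in channels:
            metric = row[f"{ch}_clicks_adstocked"]
            cvr = row[f"{ch}_noisy_cvr"]
            row[f"{ch}_conversions"] = metric * cvr
    return rows


# Two made-up daily rows for a single channel.
rows = conversions_sketch(
    [
        {"search_clicks_adstocked": 200.0, "search_noisy_cvr": 0.05},
        {"search_clicks_adstocked": 0.0, "search_noisy_cvr": 0.04},
    ],
    channels=["search"],
)
```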

consolidate_dataframe(mmm_df: DataFrame, baseline_sales_df: DataFrame) DataFrame

Filters and formats internal data into uniform output.

Parameters:
  • mmm_df (pd.DataFrame) – MMM input DataFrame

  • baseline_sales_df (pd.DataFrame) – Baseline sales DataFrame

Returns:

Consolidated MMM DataFrame

Return type:

pd.DataFrame

finalize_output(mmm_df: DataFrame, aggregation_level: str) DataFrame

Provides aggregation (daily, weekly) and column filtering for the final output

Parameters:
  • mmm_df (pd.DataFrame) – Consolidated MMM DataFrame

  • aggregation_level (str) – [daily, weekly] the granularity at which to return output data

Returns:

Finalized output DataFrame

Return type:

pd.DataFrame
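To show what moving from daily to weekly granularity involves, the following sketch sums daily values into weeks using only the standard library. It is an assumption-laden stand-in: the helper name `weekly_totals` is hypothetical, and anchoring weeks on Monday is a choice made for the example, since the documentation above does not say which weekday the library uses.

```python
from collections import defaultdict
from datetime import date, timedelta


def weekly_totals(daily):
    """Sum (date, value) pairs into weeks keyed by the Monday that starts each week."""
    totals = defaultdict(float)
    for day, value in daily:
        week_start = day - timedelta(days=day.weekday())  # weekday() is 0 on Monday
        totals[week_start] += value
    return dict(totals)


start = date(2024, 1, 1)  # a Monday
daily = [(start + timedelta(days=i), 1.0) for i in range(14)]
weekly = weekly_totals(daily)
```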

run_with_config(config: dict) tuple[DataFrame, dict]
simulate_ad_spend(baseline_sales_df: DataFrame, campaign_spend_mean: int, campaign_spend_std: int, max_min_proportion_on_each_channel: dict) DataFrame

Simulation of ad spend based on normal distribution parameters for campaign spend. Overall campaign spend is then divided amongst each channel based on passed min-max proportionality.

Parameters:
  • baseline_sales_df (pd.DataFrame) – DataFrame containing baseline sales

  • campaign_spend_mean (int) – The average amount of money spent on a campaign.

  • campaign_spend_std (int) – The standard deviation of money spent on a campaign

  • max_min_proportion_on_each_channel (dict) – Specifies the minimum and maximum percentages of total spend allocated to each channel.

Returns:

DataFrame containing ad spend data

Return type:

pd.DataFrame
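The two steps described for simulate_ad_spend (draw a campaign total, then split it across channels) can be sketched as follows. This is not the library's code. The shape of the bounds dictionary (`{"channel": {"min": ..., "max": ...}}`), the uniform draw within each range, and the renormalisation so that shares sum to one are all assumptions.

```python
import random


def ad_spend_sketch(campaign_spend_mean, campaign_spend_std, bounds, seed=0):
    """Hypothetical: draw a campaign total, then divide it among channels."""
    rng = random.Random(seed)
    # Campaign spend from a normal distribution, floored at zero.
    total = max(0.0, rng.gauss(campaign_spend_mean, campaign_spend_std))
    # Assumption: each channel's raw share is uniform within its [min, max] range.
    raw = {ch: rng.uniform(b["min"], b["max"]) for ch, b in bounds.items()}
    norm = sum(raw.values())
    # Renormalise so the channel amounts add up to the campaign total.
    split = {ch: total * share / norm for ch, share in raw.items()}
    return total, split


total, split = ad_spend_sketch(
    campaign_spend_mean=10_000,
    campaign_spend_std=1_000,
    bounds={"search": {"min": 0.4, "max": 0.6}, "social": {"min": 0.2, "max": 0.5}},
)
```

Note that renormalising can push a channel's final share slightly outside its stated range; a stricter scheme would need to resample or constrain the draws.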

simulate_baseline(base_p: int, trend_p: int, temp_var: int, temp_coef_mean: int, temp_coef_sd: int, error_std: int) DataFrame

Simulation of baseline sales and revenue for the subject business.

The simulation calculates daily baseline sales as a sum of:
  • Base sales: A constant value (base_p)

  • Trend: Linear growth over the period (total growth of trend_p)

  • Seasonality: Modeled via a sine function (height temp_var) scaled by a random importance coefficient (mean temp_coef_mean, std temp_coef_sd)

  • Error: Gaussian noise (std error_std)

If the combined terms result in negative sales, they are clamped to zero.

Parameters:
  • base_p (int) – Daily base sales units (non-marketing driven).

  • trend_p (int) – Total linear growth units over the full duration.

  • temp_var (int) – Amplitude of the seasonal sine function.

  • temp_coef_mean (int) – Mean scaling factor for seasonal impact.

  • temp_coef_sd (int) – Standard deviation of seasonal impact scaling.

  • error_std (int) – Standard deviation of daily statistical noise.

Returns:

Daily baseline sales components.

Return type:

pd.DataFrame
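The composition of baseline sales described for simulate_baseline can be sketched in plain Python. This is an illustration under stated assumptions, not the library's implementation: the function name `baseline_sketch`, the `days` and `seed` arguments, the one-cycle-per-year sine, and drawing the importance coefficient once per run are all guesses made for the example.

```python
import math
import random


def baseline_sketch(base_p, trend_p, temp_var, temp_coef_mean,
                    temp_coef_sd, error_std, days=365, seed=0):
    """Hypothetical daily baseline: base + trend + seasonality + noise."""
    rng = random.Random(seed)
    # Assumption: one importance coefficient is drawn for the whole run.
    temp_coef = rng.gauss(temp_coef_mean, temp_coef_sd)
    sales = []
    for day in range(days):
        trend = trend_p * day / max(days - 1, 1)  # reaches trend_p on the last day
        season = temp_coef * temp_var * math.sin(2 * math.pi * day / 365)
        error = rng.gauss(0, error_std)
        # Negative totals are clamped to zero, as the description states.
        sales.append(max(0.0, base_p + trend + season + error))
    return sales


sales = baseline_sketch(base_p=1000, trend_p=200, temp_var=8,
                        temp_coef_mean=50, temp_coef_sd=5, error_std=100)
```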

simulate_cvr(spend_df: DataFrame, noisy_cvr: dict) DataFrame

Generates the conversion rate (CVR) for each channel from the true conversion rates passed in the basic params, with noise applied according to the parameters passed to this function.

Parameters:
  • spend_df (pd.DataFrame) – DataFrame containing ad spend data

  • noisy_cvr (dict) – Specifies the bias and scale of noise added to the true value CVR for each channel.

Returns:

Updated spend DataFrame

Return type:

pd.DataFrame
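One way to picture "bias and scale of noise added to the true value" is a normal perturbation centred on the bias. The sketch below assumes exactly that and also clips the result to [0, 1]. The function name, the argument names, and the clipping are assumptions, not documented behaviour.

```python
import random


def noisy_cvr_sketch(true_cvr, bias, scale, days, seed=0):
    """Hypothetical: true CVR plus Gaussian noise (mean=bias, std=scale), clipped to [0, 1]."""
    rng = random.Random(seed)
    return [
        min(1.0, max(0.0, true_cvr + rng.gauss(bias, scale)))
        for _ in range(days)
    ]


cvr = noisy_cvr_sketch(true_cvr=0.03, bias=0.0, scale=0.005, days=30)
```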

simulate_decay_returns(spend_df: DataFrame, adstock: dict, saturation: dict) DataFrame

Generates the decay and returns values associated with adstock and diminishing returns.

Parameters:
  • spend_df (pd.DataFrame) – DataFrame containing ad spend data

  • adstock (dict) – Nested dictionary for adstock configuration

  • saturation (dict) – Nested dictionary for saturation configuration

Returns:

MMM input DataFrame with decay and returns applied

Return type:

pd.DataFrame
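The documentation does not say which functional forms are used for adstock and saturation. As a sketch of two common choices in marketing mix modelling, the example below uses geometric adstock (each day carries over a fixed fraction of the previous day's effect) and a Hill curve for diminishing returns. Both function names and their parameters are hypothetical and may not match the keys expected in the adstock and saturation dictionaries.

```python
def geometric_adstock(spend, decay):
    """Carry over a fraction `decay` of the previous day's adstocked value."""
    carried, out = 0.0, []
    for x in spend:
        carried = x + decay * carried
        out.append(carried)
    return out


def hill_saturation(x, half_saturation, shape):
    """Hill curve: 0 at x=0, 0.5 at x=half_saturation, approaching 1 as x grows."""
    if x <= 0:
        return 0.0
    return x ** shape / (x ** shape + half_saturation ** shape)


# A single burst of spend decays geometrically over the following days.
adstocked = geometric_adstock([100.0, 0.0, 0.0], decay=0.5)
saturated = [hill_saturation(a, half_saturation=50.0, shape=2.0) for a in adstocked]
```

With these inputs the adstocked series is 100, 50, 25, and the Hill response at the half-saturation point (50) is exactly 0.5.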

simulate_geos(mmm_df: DataFrame, geo_params: dict) DataFrame

Distributes the consolidated MMM dataframe into geographies.

Parameters:
  • mmm_df (pd.DataFrame) – Consolidated MMM DataFrame

  • geo_params (dict) – Parameters for geographic distribution

Returns:

MMM DataFrame with geographic distribution

Return type:

pd.DataFrame

simulate_media(spend_df: DataFrame, true_cpm: dict, true_cpc: dict, noisy_cpm_cpc: dict) DataFrame

Simulation of relevant media metrics for each channel. True values are passed and noise is applied in accordance with a normal distribution described within the noisy dict. Media metrics are checked for 0 values stemming from the random noise applied and will be flagged with logger when found. It is generally understood that negative values should not arise for media metrics.

Parameters:
  • spend_df (pd.DataFrame) – DataFrame containing ad spend data

  • true_cpm (dict) – Specifies the true cost per thousand impressions (CPM) of each channel (noise will be added to this to simulate number of impressions)

  • true_cpc (dict) – Specifies the true Cost per Click (CPC) of each channel (noise will be added to this to simulate number of clicks)

  • noisy_cpm_cpc (dict) – Specifies the bias and scale of noise added to the true value CPM or CPC for each channel.

Returns:

Updated spend DataFrame

Return type:

pd.DataFrame
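The relationship between spend, noisy costs, and media metrics can be sketched as below. The example assumes CPM means cost per thousand impressions, that the noise is a normal draw with mean `bias` and standard deviation `scale`, and that a non-positive noisy cost is logged and the metric set to zero. The function name, its arguments, and that fallback are assumptions rather than the library's documented behaviour.

```python
import logging
import random

logger = logging.getLogger("media_sketch")


def media_sketch(spend, true_cpm, true_cpc, bias, scale, seed=0):
    """Hypothetical: derive impressions and clicks from spend and noisy costs."""
    rng = random.Random(seed)
    noisy_cpm = true_cpm + rng.gauss(bias, scale)
    noisy_cpc = true_cpc + rng.gauss(bias, scale)
    metrics = {}
    # Impressions are bought per thousand (CPM); clicks are bought singly (CPC).
    for name, cost, units in (("impressions", noisy_cpm, 1000), ("clicks", noisy_cpc, 1)):
        if cost <= 0:
            logger.warning("non-positive noisy cost for %s; metric set to 0", name)
            metrics[name] = 0.0
        else:
            metrics[name] = spend / cost * units
    return metrics


# With zero noise the metrics follow directly from the true costs.
metrics = media_sketch(spend=1000.0, true_cpm=10.0, true_cpc=2.0, bias=0.0, scale=0.0)
```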

Visualization

class pysimmmulator.visualize.Visualize
plot_clicks(df: DataFrame, agg: str = None)

Plot simulated clicks data based on a passed date-wise aggregation

Parameters:
  • df (pd.DataFrame) – DataFrame containing simulated data

  • agg (str) – pick from [‘daily’, ‘weekly’, ‘monthly’, ‘yearly’] to aggregate simulated data by

plot_impressions(df: DataFrame, agg: str = None)

Plot simulated impressions data based on a passed date-wise aggregation

Parameters:
  • df (pd.DataFrame) – DataFrame containing simulated data

  • agg (str) – pick from [‘daily’, ‘weekly’, ‘monthly’, ‘yearly’] to aggregate simulated data by

plot_revenue(df: DataFrame, agg: str = None)

Plot simulated revenue data based on a passed date-wise aggregation

Parameters:
  • df (pd.DataFrame) – DataFrame containing simulated data

  • agg (str) – pick from [‘daily’, ‘weekly’, ‘monthly’, ‘yearly’] to aggregate simulated data by

plot_spend(df: DataFrame, agg: str = None)

Plot simulated spend data based on a passed date-wise aggregation

Parameters:
  • df (pd.DataFrame) – DataFrame containing simulated data

  • agg (str) – pick from [‘daily’, ‘weekly’, ‘monthly’, ‘yearly’] to aggregate simulated data by