
Data Analytics: Influences of Gross Film Revenue Across Three Decades

 

Data Analytics: Influences of Gross Film Revenue & Opportunity Analysis

December 6, 2017

Todd Benschneider, Austin Deno, Leigh Harris, Sarah Lassiter, Lisa Velesko
Table of Contents

Problem Significance

Data Source & Preparation

Variable Selection

Preliminary Analysis

Models

Insights


Problem Significance:

Several societal trends can be mined from consumer spending patterns in the film industry, especially by comparing film genres, whose rising and falling popularity tracks shifts in popular fiction. Films, more so than television, literature, or music, respond quickly to consumer tastes in fiction and fantasy, and so reflect the psyche of a generation and its ever-shifting emotional underpinnings. Filmmakers have become astoundingly proficient at catering to the emotional voids that drive the fiction market, and that responsiveness shows clearly in the ever-changing mix of successful films. Through the unspoken demand for clearly defined types of storylines, these quickly produced films reveal a meaningful cross-section of a society’s unfulfilled drives and highlight what its members yearn for in their own lives.

In addition to the trending popularity of particular scripts, other valuable economic indicators can be harvested by examining the genres that are trending downward, since a declining genre suggests that the underlying drive behind it has since been fulfilled through sociological evolution. Marketing professionals are wise to take note of the peak and decline of each passing trend, as those peaks and valleys capture, at a macro level, the overall mood of a culture.

In the entertainment and media industries, consumer spending on different types of fiction offers insight into long-term patterns of emotional and economic wants that is as useful to producers of consumer goods as it is to providers of entertainment. It is imperative for businesses to be on the forefront of any trend.

Our data set summarizes three decades of consumer spending on films, which potentially reveals early predictors of future spending behavior. By forecasting trends from these patterns of film revenue, a business can be on the forefront of meeting changing consumer tastes, whether that firm creates new movie plots, automobiles, or widgets. With insight into the deepest desires of the society around it, a business can tailor its marketing message to align its product with consumers’ vision of not who they are but what they want to be. Few other data sources provide as much insight into the fantasy characters consumers identify with as film plots do, and with this three-decade data set we expect to gauge the tipping points of long-term trends and witness the rebounds that those tipping points predicted.

Our team viewed the movie revenue data from the perspective of a movie merchandiser, evaluating which unreleased movies in production would provide our firm with its highest return on investment for movie-themed posters, toys, clothing and related merchandise. The highest budget films command the highest royalty percentages and also require the greatest undiversified commitment of our manufacturing lines to individual movie projects. Because of the risk and profitability factors affiliated with marketing the high budget prospects, our team instead drilled down into the data looking for the more cost effective prospects. Films that maximize the return on investment allow our firm to utilize a more diversified portfolio of projects with more promising cash flows.

With this goal in mind, we chose to use regression models to dig deeper into the other categorical data in the set, hoping to find actionable predictors that could be valuable on a shorter timeline. We evaluated the given variables in search of the most significant predictors of cinematic success, in order to gauge our confidence in future investments.

Data Source & Preparation:

The data set was originally gathered from IMDb and sourced directly from Kaggle. It covers 6,820 movies released from 1986 to 2016 and includes details such as budget, gross revenue, production company, country of origin, director, primary genre, movie name, motion picture rating, release date, runtime, IMDb user score, lead star, IMDb user votes, writer, and release year.

Not all movies contained information regarding the budget of the movie. Those were removed, as complete data points, especially for budget, were critical to our analysis. We also examined profit and return on investment, each derived from gross and budget, as separate measures.
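
As a minimal sketch of this preparation step, the R snippet below drops rows without a usable budget and derives profit and ROI. The data frame `movies` and its values are invented for illustration, and treating both NA and 0 as a missing budget is an assumption rather than something the report states.

```r
# Hypothetical mini data set; column names mirror the fields described above.
movies <- data.frame(
  name   = c("A", "B", "C", "D"),
  budget = c(50e6, NA, 0, 10e6),   # NA or 0 is treated as "budget missing" (assumption)
  gross  = c(120e6, 30e6, 5e6, 8e6)
)

# Keep only rows with a usable budget.
d <- movies[!is.na(movies$budget) & movies$budget > 0, ]

# Derived measures: profit in dollars, ROI as a fraction of budget.
d$profit <- d$gross - d$budget
d$roi    <- d$profit / d$budget

print(d)
```

Expressing ROI as a fraction of budget keeps it on the same scale as the ROI figures reported later (for example, .1666 for 16.66%).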

Tableau and Excel were first used to identify the largest values of each variable, which allowed us to postulate our first level of filtering. R was then used to plot the data with histograms, box plots, and scatter plots to check for outliers; to run regression models with numeric and qualitative predictors, with and without interaction terms; to check multicollinearity and direct correlations; and to compute R-squared and adjusted R-squared, along with the Akaike Information Criterion (AIC) and Bayesian Information Criterion (BIC), to assess goodness of fit. Charts in Tableau were generated to visually verify the interaction effects. Together, Tableau, Excel, and R were used to determine the strongest numeric, qualitative, and interaction predictors among the variables.

Variable Selection:

Response Variable: In our effort to uncover the driving forces behind blockbuster films, we questioned what causes box office achievement. There are far too many flops in show-business; artistic potential is drowned out, consumer trends are completely misinterpreted, and lucrative investments are wasted. We must review success in cinema and provide a supportive study to investors in major motion pictures to appease the masses and create a stable platform for performers, thereby providing a concrete analysis of how gross revenue is determined. We therefore selected “Gross”, defined by our IMDb source as “gross revenue at the box office” as our response variable for all data modeling in this study.


Predictor Variables:
To choose the best variables to test against our response variable, we created a correlation table (below) of the quantitative variables. We focused on which variables could have a strong effect on gross. The motive in tracking down the most influential variables is so that investors can later factor them into their decision to support a film.

Correlation Chart

           Budget     Gross      Runtime    Score      Votes
Budget     1          0.680033   0.313064   0.073579   0.451467
Gross      0.680033   1          0.253273   0.229552   0.642904
Runtime    0.313064   0.253273   1          0.417031   0.359817
Score      0.073579   0.229552   0.417031   1          0.470648
Votes      0.451467   0.642904   0.359817   0.470648   1
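
A table like this can be produced in R with `cor()`. The sketch below uses a small made-up data frame (the values are not from the film data) purely to show the call and the shape of the result.

```r
# Toy numeric columns (invented values) standing in for the film data.
d <- data.frame(
  budget  = c(10, 25, 40, 60, 90, 150),
  gross   = c(12, 60, 35, 140, 160, 400),
  runtime = c(88, 95, 101, 110, 124, 139),
  score   = c(5.9, 6.4, 6.1, 7.0, 6.8, 7.7),
  votes   = c(8, 40, 22, 150, 130, 600)
)

# Pearson correlations between every pair of numeric columns.
corr <- round(cor(d), 3)
print(corr)
```

The result is a symmetric matrix with ones on the diagonal, which is why each value in the chart appears twice.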

To no surprise, the correlation that stood out the most was between gross revenue and budget, at .680033. This correlation suggests that a higher budget will most likely fund a movie that generates more revenue. As we believe budget is the heaviest deciding factor in funding the crucial elements of a financially successful film, we regard it as our primary predictor variable, against which our other qualitative and quantitative variables will be matched.

 

The second highest correlation, at .642904, was between “gross” revenue and “votes” (the number of IMDb user votes a film received). We can justify this correlation two-fold. First, more “votes” logically means more tickets were purchased to watch the movie in theaters. Second, a high number of votes can drive consumer demand, influencing movie-goers who have not yet viewed the film to either watch or avoid it depending on how positive the reviews were. While the first explanation arises after the fact of viewership, the second has the potential to boost viewership, which would make this variable causal. However, since we cannot determine whether “votes” was causal or coincidental, and since the standard error in a simple regression with gross is very large, we decided not to use it as a predictor variable in our study. Likewise, we deemed “score”, which is derived from the votes, unacceptable as a variable in our models because we cannot control the scores that reviewers give.

 

“Runtime”, the final quantitative variable, refers to the length of the film in minutes and has a modest positive correlation with “gross” at .253273, so we must consider what this logically means. The correlation says that as runtime increases, gross revenue also increases. We know this statement has a limit, because if movies ran for countless hours we could not logically expect their popularity to rise accordingly. In support, a simple regression also shows that, like “votes”, “runtime” has an unacceptably high standard error, at 52262.

 

As far as qualitative data, we opted to use both primary “genre” and motion picture “rating” as major predictors of gross, as supported by their high multiple r-squared values. We determined these were likely predictors of movie success based on consumer taste.

 

Finally, we decided not to use the production company, country of origin, director, movie name, date released, and year released, as these factors would be largely outside the control of film investors. These variables are also too diverse to classify accurately, since their values are spread so thinly across the data.

Preliminary Analysis:

Following our variable selection, we began looking at patterns surrounding the relationships between revenue and movie genre and motion picture rating. It’s important for investors to stay current on consumer trends in order to predict where the big money will be made in the film industry.
Hypothesis Testing:

 

Hypothesis 1:

  1. Since the Action genre and PG-13 rating have the highest gross revenue out of all movies, it is logical to assume that these types will also generate the highest return on an investor’s funding once the production hits the theaters. We consider this plausible because budget alone explains about 46% of the variation in gross revenue.

H0: Action genre and PG-13 rating have the highest return on investment and an Action PG-13 rated movie will generate the most dollars per dollars invested.
Ha: Action genre and PG-13 rating do not have the highest return on investment.

Genre:

After seeing the strong relationship between gross and motion picture genre, we separated the genres to see which classifications raked in the most at the box office. We found that the movies with the highest gross revenue were Action, with a combined total of over $708 million. Not coincidentally, Action movies also had a higher total budget than all other genres. Since budget has a strong linear correlation with gross, we assumed that Action would produce a higher return on investment than any other genre.

Rating:

We similarly compared motion picture ratings to gross revenue to identify that PG-13, R, and PG, respectively, generated the most revenue over the course of the 30 year history and looked at the gross revenue and budget within each sector.


Hypothesis 2: Since popular actors have a strong influence over consumer taste, we can assume that star power has a significant effect on gross revenue. Since a high budget is needed to hire them, we can also assume that as budget increases, more coveted actors can be cast, resulting in a very popular, high grossing film.

 

H0: Movies with budgets in the top quartile (above the third quartile) will have a significant relationship between star and gross.
H1: Movies with budgets in the top quartile (above the third quartile) will have no relationship between star and gross.

 

Star: We attempted to identify the correlation of stars to gross revenue by exploring, in Tableau, the total number of movies in which each star has been the lead and the sum of the gross revenue for those movies.  We believed that particular stars would impact the budget and also impact the gross revenue.  How frequently a star appears in movies could also add to their popularity and consequently generate more box office revenue as consumer demand to see that star increased.  In running a regression model, there were specific stars, such as Chris Pratt(1), Daisy Ridley(1), Ellen DeGeneres(1), Felicity Jones(2), Heather Donahue(1), Jennifer Lawrence(8), Louis C.K.(1), Neel Sethi(1), Paige O’Hara(1), Quinton Aaron(1), Sam Neill(3), Sam Worthington(4), Scott Weinger(1), and Taylor Kitsch(1), that had a significant influence on gross revenue when interacted with budget.  With all but Jennifer Lawrence listed as the star in fewer than five films, and most in only one, as indicated by the number next to each star, we determined that additional factors were driving these results, such as the co-star, whether the movie already had a cult following, whether it was a book first, etc.  We did run a sample test using Jennifer Lawrence and Will Smith to note that, at least for these two stars, there was a positive correlation between gross revenue and budget, as depicted in the scatterplot below.

   


Models:

model1 <- lm(gross ~ budget, data = d)

The correlation chart was a basic look at the relationship between gross revenue at the box office and film budget. We then confirmed that this relationship is statistically significant by running a simple regression, although a regression alone cannot establish that it is causal. With a multiple R-squared value of .4624, this model shows that 46.24% of the variation in gross revenue can be explained by the budget. Budget also has a very low p-value (2e-16), proving to be a significant factor in predicting a high gross. A higher budget movie has greater potential to purchase the necessary artists, talent, and advertising to create a higher grossing product.
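
For a regression with a single predictor, R-squared equals the squared correlation, which is why the .680033 correlation from the chart lines up with an R-squared of about .4624 (0.680033 squared is roughly 0.4624). The sketch below illustrates that identity on synthetic data; the numbers generated here are made up and are not the film data.

```r
set.seed(1)  # reproducible synthetic data (not the film data)
budget <- runif(200, min = 1, max = 200)           # in $ millions
gross  <- 5 + 1.1 * budget + rnorm(200, sd = 60)   # linear signal plus noise
d <- data.frame(budget, gross)

model1 <- lm(gross ~ budget, data = d)
s <- summary(model1)

r2      <- s$r.squared                       # share of variance explained
p_value <- coef(s)["budget", "Pr(>|t|)"]      # significance of the slope

# For a one-predictor model, R-squared equals the squared correlation.
print(c(r2 = r2, cor_squared = cor(d$budget, d$gross)^2))
```
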
model2 <- lm(gross ~ budget + as.factor(genre), data = d)

Wrapping genre in as.factor, we built a second model that explains how a movie’s budget and genre affect its revenue. This model had a slightly higher adjusted R-squared, at .4691. This model also shows that out of all the genres, the most significant ones were Action, Adventure, Animation, Comedy, and Horror. This indicates that these five genres have the most impact on the revenue of a film once its budget is known. However, without knowing the budget, Comedy, Drama, and Horror have the most significant impact on gross revenue.
However, we know that correlation does not translate to causation. We therefore tempered our analysis with a linear regression model, placing “Gross” as the response variable with “Budget” and “Genre” (as a factor) as predictors. We used budget as a control because we want to know how dollars invested in a movie, and more specifically in a movie genre, would be returned. To our surprise, Action was not the most significant factor; Animation was, as confirmed by a lower p-value and a higher coefficient. In fact, the regression indicated that with a hypothetical budget of $0, an Animation movie would produce $22.2M more in revenue than an Action movie. This was an astonishing and valuable discovery.  We noted that Action, Adventure, Animation, Comedy, and Horror all had significant influences.
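
To show how a genre coefficient is read, the sketch below fits an additive budget-plus-genre model on noise-free toy data. All numbers are invented; the Animation shift is set to 22.2 only to echo the figure above. Because R orders factor levels alphabetically by default, "Action" becomes the baseline and each remaining genre coefficient is a difference from Action at the same budget.

```r
# Noise-free toy data (invented numbers) so the fitted coefficients are easy to read.
d <- data.frame(
  budget = rep(c(10, 20, 30, 40), times = 3),
  genre  = rep(c("Action", "Animation", "Comedy"), each = 4)
)
genre_shift <- c(Action = 0, Animation = 22.2, Comedy = 5)   # hypothetical genre effects
d$gross <- 3 + 2 * d$budget + unname(genre_shift[as.character(d$genre)])

model2 <- lm(gross ~ budget + as.factor(genre), data = d)
print(round(coef(model2), 3))

# "Action" sorts first, so it is the baseline level absorbed into the intercept.
animation_vs_action <- unname(coef(model2)["as.factor(genre)Animation"])
```
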
model3 <- lm(gross ~ budget + as.factor(genre) + budget:as.factor(genre), data = d)

Our third model explains gross revenue with the budget and genre of the film plus the interaction effect between budget and genre. This model was slightly better, with an adjusted R-squared of .4696. It showed that the budget for a specific genre has a slight effect on gross revenue. Budget is more significant for the Action, Comedy, Drama, and Horror genres.

model4 <- lm(gross ~ budget + as.factor(rating) + budget:as.factor(rating), data = d)

For our fourth model, we looked at gross revenue with the interaction between budget and rating. This helped us narrow our data to find the most significant rating for gross revenue as budget increases. This model had an adjusted R-squared of .4736. Of all the ratings, R and G were the most statistically significant.

 

Looking at just the adjusted R-squared and the AIC/BIC, the fourth model was the best predictor of gross revenue. However, the rating-to-budget interaction was only slightly better than the genre-to-budget interaction. Both our third and fourth models narrowed down our data because they took the genre and rating into consideration with respect to the budget of the film. These two qualitative variables were the most significant in predicting gross revenue beyond the movie budget alone.  When we combined both interactions in one model, PG-13 and Horror had the highest and only significant interaction; that model had a slightly higher R-squared but also higher AIC and BIC, which prompted us to return to the previous model and generate the chart below to illustrate our findings.
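
The comparison described above can be sketched as follows. The data are synthetic, with per-rating slopes invented so that an interaction is genuinely present, and the names `fit_add` and `fit_int` are illustrative. Higher adjusted R-squared and lower AIC and BIC point to the better-fitting model.

```r
set.seed(42)  # synthetic data only; not the film data
n <- 300
d <- data.frame(
  budget = runif(n, 1, 150),
  rating = sample(c("G", "PG", "PG-13", "R"), n, replace = TRUE)
)
slope <- c(G = 1.6, PG = 1.2, "PG-13" = 1.0, R = 0.7)   # invented per-rating slopes
d$gross <- 10 + unname(slope[as.character(d$rating)]) * d$budget + rnorm(n, sd = 25)

fit_add <- lm(gross ~ budget + as.factor(rating), data = d)   # no interaction
fit_int <- lm(gross ~ budget * as.factor(rating), data = d)   # budget-by-rating interaction

comparison <- data.frame(
  model  = c("additive", "interaction"),
  adj_r2 = c(summary(fit_add)$adj.r.squared, summary(fit_int)$adj.r.squared),
  AIC    = c(AIC(fit_add), AIC(fit_int)),
  BIC    = c(BIC(fit_add), BIC(fit_int))
)
print(comparison)
```

AIC and BIC both penalize extra coefficients, so a model with more interaction terms can show a slightly higher R-squared and still score worse on these criteria.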

 

Confidence Interval Testing:

 

With the information we gathered from the regression models, we now have an in-depth look at the effect of budget on genre and rating as they relate to gross revenue. However, these findings contradict our earlier hypotheses. To examine our original assumptions, we performed confidence interval testing.

 

First, we subsetted the data by creating a new dataframe with only Action genre movies rated PG-13. Then we created another variable, ROI, by applying the ROI formula to the budget and gross columns. A summary of the data showed that the mean ROI for PG-13 Action movies was .1666255, or 16.66%, which seems reasonable. If an investor were to invest $100,000, they could expect an average gross return of about $116,700 after the movie hits theaters. With a sample size of 468, we used the normal distribution with 97.5% confidence to determine that the ROI on this type of movie would fall between .0899811 and .4232491. This is a fairly large range, but we can say with that level of confidence that the return on investment should be no higher than 42.32%.
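
A minimal sketch of this procedure, assuming ROI is defined as (gross - budget) / budget and that the interval is a two-sided normal-based interval for the mean ROI. The data are synthetic, so the output will not match the figures above.

```r
set.seed(7)  # synthetic stand-in for the PG-13 Action subset
d <- data.frame(
  genre  = "Action",
  rating = "PG-13",
  budget = runif(468, 20e6, 200e6)
)
d$gross <- d$budget * rlnorm(468, meanlog = 0, sdlog = 0.6)   # invented revenue multiples

sub <- subset(d, genre == "Action" & rating == "PG-13")
sub$roi <- (sub$gross - sub$budget) / sub$budget              # assumed ROI formula

n        <- nrow(sub)
mean_roi <- mean(sub$roi)
se       <- sd(sub$roi) / sqrt(n)      # standard error of the mean
z        <- qnorm(1 - 0.025 / 2)       # critical value for a two-sided 97.5% interval
ci       <- c(lower = mean_roi - z * se, upper = mean_roi + z * se)
print(round(c(mean = mean_roi, ci), 4))
```

An interval built this way is centered on the sample mean and describes the average ROI of the group rather than the ROI of any single film.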

 

Drawing on the strong significance and high coefficient strength in our regression models, we used the same confidence interval testing on R rated Horror films to test the strength of our first null hypothesis. We performed the same subsetting technique to obtain a dataframe of only R rated Horror movies, a set of 173 movies. After removing two extreme outliers, the mean ROI was 2.6610, or 266.1%. The testing gave us 97.5% confidence that the expected ROI should fall between 113.89% and 646.1%.

 

In conclusion, at 97.5% confidence, R rated Horror movies could produce an ROI as high as 646.1%, compared to the maximum potential of 42.32% for a PG-13 Action movie.
We can view this practically and justify the logic in Horror movies having the highest total ROI. Looking at the data, horror movies can be made with relatively low budgets and yield much higher profit. Movies like Paranormal Activity and The Blair Witch Project (the two outliers we removed before confidence interval testing) are prime examples of this phenomenon. The Blair Witch Project cost only around $15,000 to make but made $107,918,810 in box office revenue, roughly 7,194 times its budget, or an ROI of about 719,000%. This data will allow us to make the most informed decision when considering investing or merchandising.
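
The Blair Witch Project arithmetic can be checked directly from the two figures quoted above, again assuming ROI is (gross - budget) / budget. As a ratio the result is about 7,194; on the percentage scale used for the other ROI figures in this report, that is roughly 719,000%.

```r
budget <- 15000        # production cost quoted above
gross  <- 107918810    # box office revenue quoted above

roi_ratio   <- (gross - budget) / budget   # profit per dollar invested
roi_percent <- 100 * roi_ratio

print(round(roi_ratio, 1))
print(round(roi_percent))
```
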

 

Insights:

In analyzing the data, we found that budget had the strongest significance and correlation to gross revenue.  Neither genre nor rating influenced gross revenue more than budget itself, but both were highly significant subfactors when combined with budget.  Ratings of “R” and “G”, along with the genres Action, Comedy, Drama, and Horror, had the highest significance when interacted with budget to predict gross revenue, as depicted in the charts above.

As score and votes only come after the fact, an investor or merchandising company looking to predict which movies will gross the highest revenue, and consequently have the potential to yield the highest returns on product related to that movie, should look specifically to an “R” or “G” rated movie in the Action, Comedy, Drama, or Horror genre. This can be demonstrated by the movie “The Hangover,” which led to a major economic impact in Las Vegas.

In conclusion, while we have familiarized ourselves with the tools and theories of data mining for business applications, the most important lesson we have learned has been to view data insights with cautious skepticism. We are confident that our regression analysis was accurate and that our data source appeared reliable; however, few of us are prepared to wager our professional reputations by advising a CEO to allocate millions of dollars of investor capital based on the actionable insights that we are recommending. In actual practice, we would recommend finding alternate sources of similar data sets to verify these conclusions. In addition to our newfound perspective on the practical value of data mining, we are now prepared to temper future data-sourced predictions with a managerial “P-Value”, named the “Group 6 N-Value”, to represent common sense and intuition. We therefore recommend that when proposed data sets lead us down a path of assumptions based on low p-values and high adjusted R-squared values, but contradict our own personal “N-Values”, we should first pursue additional data sets and alternate models to demonstrate, without doubt, that those statistical results are indeed replicable and justifiable in the abstract science of strategic management and consumer behavior.

Board Independence is Less Effective at Deterring Accounting Fraud in Family Controlled than in Publicly Held Corporations

An Annotated Bibliography by Todd Benschneider

Prencipe, Annalisa, and Sasson Bar-Yosef. “Corporate Governance and Earnings Management in Family-Controlled Companies.” Journal of Accounting, Auditing and Finance. April 2011, Vol. 26 Issue 2, p199-227. 29p. Database: Business Source Alumni Edition.

Annalisa Prencipe, a PhD and senior lecturer at SDA Bocconi School of Management, and her team of researchers conducted a study of 249 firms to compare the quality (long-term sustainability) of profits in family controlled firms to the earnings of publicly held companies. The study investigated the impact of “earnings management strategies,” a term that The Journal of Accountancy defines as “the discretionary distortion of revenue, expense and depreciation schedules to optimize short term goals such as executive bonuses, budget targets or manipulation of stock prices.”  The results of the study were intended to provide accounting firms with new tools for identifying ratios and patterns that detect shareholder fraud in family controlled firms.

In publicly held firms strong incentives such as performance bonuses, performance reviews and salary bonuses lure executives to portray company financials in the most positive light, while concealing negative information from financial reports. However, over reporting earnings provides inaccurate feedback to the product development, finance and marketing departments who rely on accurate reporting to steer future products and operations strategy. Extended periods of inaccurate market feedback can undermine the long term economic health of the company. Stockholders can reduce mismanagement by electing an independent board of directors who hire, evaluate, supervise and fire top level executives to ensure that strategic decisions represent the shareholders’ best interest.

Prencipe explains that “A typical board structure is composed of outside directors and top company officers. Outside directors are appointed by the company’s shareholders and are assumed to be acting in the shareholders’ interests. However, the inclusion of top management among board members may give rise to a conflict of interest as management may attempt to transfer wealth from stockholders by taking advantage of information asymmetry. The results show that the increase in shareholder wealth is significantly higher when the board is dominated by independent directors.”

Recent trends in corporate governance now encourage firms’ directors to enforce accurate financial reporting. Board oversight can identify executives who exploit short range strategies that inflate profits to capitalize on performance bonuses. By the time the earnings management schemes unravel, the executives involved have often retired or moved on to other companies, which limits the legal recourse available to the stakeholders. Public demand in response to recently publicized investor fraud cases has prompted legislators to issue regulations that hold board members accountable to shareholders for fraudulent reporting by the executives they oversee. Regulatory changes in corporate governance have been removing company executives from boards of directors to reduce their influence over the boards’ objectivity, especially by preventing CEOs from also serving as Chairman of the Board.

However, family controlled companies face different incentives to publish inaccurate financials. Further complicating the distribution of power, the CEO is often also the largest stockholder of the company, entitling them to serve as the Chairman of the Board.  Prencipe wrote “Current literature suggests that, although founding family ownership seems to be associated, on average, with higher earnings quality, the extent of earnings management remains an open issue for family controlled firms. Since most families with controlling interest in their company possess a long term vision for growth and therefore make decisions that favor long range goals rather than boosting quarterly profits.”

Prencipe believes that, while experts agree that there is less incentive for family controlled firms to overreport earnings, those companies instead manage earnings to secure the family’s controlling interests, minimizing the distribution of wealth to minority shareholders. She hypothesized that recent corporate governance restructuring would be less effective in family controlled companies whose self-interest lies in underreporting earnings, especially where family members also serve in salaried executive positions and can increase their own bonuses or siphon private benefits, such as supplier kickbacks, travel expenses and other concealable business write-offs, at the expense of other shareholders.

The study was expected to validate previous research that had shown a lower incidence of earnings management under a board of directors with independent decision making authority, especially boards not chaired by the CEO.  A board with low independence has many of the company’s executives voting on board decisions, with the CEO also serving as the chairman of the board. With a highly independent board, the CEO does not hold a seat and possesses only subordinate levels of authority in regulating corporate accounting. However, this study would specifically compare results from widely held public corporations against those from private firms and measure the estimated earnings management strategies present in the financial reports. Levels of earnings management in the companies would be calculated from a fraudulent accounting indicator: abnormal working capital accruals (AWCA).

Prencipe and Bar-Yosef applied AWCA audit calculations to a sample of 249 Italian corporations spanning four governance structures of publicly traded companies:

  1. Family Controlled with CEO on the Board of Directors

  2. Family Controlled with no executives on the Board of Directors

  3. Publicly Held with CEO on the Board of Directors

  4. Publicly Held with no executives on the Board of Directors

The intent of their study was to see whether a correlation could be found suggesting that any of these four governance structures yielded higher quality long-range financial growth. The results validated several previous studies that found higher quality earnings generated by publicly held corporations with a highly independent board of directors. The results also supported Prencipe’s hypothesis that family controlled firms outperformed publicly held firms in earnings quality; however, there was a less pronounced advantage to private firms with a highly independent board when compared to public firms with an identical governance structure.

Prencipe closed her article with:

“Our conclusions may lead regulators and academics to reevaluate the effectiveness of some corporate governance models when applied to family controlled companies. In particular, our results suggest that regulators should pay special attention to the selection of board members. For the benefit of all shareholders, it is important to guarantee substantial independence of the board. Our results are also useful to users of financial statements, suggesting that a company’s ownership structure and its corporate governance characteristics should be taken into account when accounting numbers are used.”