Exploratory Study on Anchoring: Fake Vote Counts in Consumer Reviews Affect Judgments of Information Quality

Popular products in online stores often have overwhelming numbers of reviews. To help consumers identify quality reviews, (Site 1) crowdsources this decision by asking consumers to vote on the helpfulness of reviews. Many studies assume these votes reflect the information quality of a review, but this does not account for the influence of fake votes. This study investigates whether fake votes influence judgments of review information quality. From controlled questionnaires given to 294 consumers, we found that fake vote counts could completely reverse judgments of information quality in both directions. Specifically, fake votes affected consumers’ processes of purchase decision-making and product research. Overall, fake votes changed both review judgments and purchase behaviors.


Introduction
Online consumer reviews (online reviews or reviews hereafter) offer great value to customers [52]. About 70% of Americans consult reviews before making a purchase decision [4], and 95% of U.K. consumers always or sometimes consult reviews before purchasing online or in stores [66]. These statistics show that online reviews are critical in consumers' purchase process.
Because reviews are important, review manipulation is becoming more common. A recent Gartner report [77] noted that 2% to 6% of ratings and reviews are deceptive and that an industry of fake reviewers is emerging, with prices ranging from $1 to $200 per review depending on the product. Studies using other algorithms and techniques have estimated that about 2% to 10.3% of online reviews are fake [33], [45].
While assessing the content of a review may tell us if it is fake, detecting fake review ratings is much more challenging because a rating does not leave any traces of author authenticity. Because a manipulated rating gives a false signal about the information quality (IQ) of a review, it is important to understand the extent of the influence of rating manipulations. Some websites give users the ability to rate the quality of reviews. Amazon.com (Site 1), for instance, allows consumers to vote on whether a review is helpful or not (e.g., 12 of 20 people found the following review helpful), as shown in Figure 1. These helpfulness votes (H-Votes hereafter) and their ratios (the total YES votes divided by the total votes) are used to sort reviews in order of review helpfulness. In addition, Amazon.com (Site 1) prominently displays the two most helpful positive and negative reviews for each product.
Previous studies on consumer reviews focused on the relationship between content characteristics and content authenticity (e.g., [32], [53]) or between content characteristics and H-Vote ratios (e.g., [41], [52]). Yet these studies regard H-Votes as authentic. Some studies assumed consumers used their own judgment to assess the review information quality (IQ) regardless of the number of H-Votes. However, the anchoring effect [80] can affect how consumers interpret existing H-Votes. That is, consumers may be more likely to rate a review as helpful when they see a higher H-Vote ratio than a lower one. If so, the existing H-Vote ratio would influence subsequent votes in a cascading manner. Thus, our first research question is: to what extent does the anchoring effect exist in H-Vote ratios?
Anchoring may not only influence the moment of judging review IQ. It may also affect the product learning process that consumers typically go through before making a purchase decision. In this paper, we refer to product learning as the exploration of product functionality and other aspects, equivalent to the intelligence and design phases in Simon's decision-making process [70]. In contrast, purchase decision-making refers to the choice selection and purchase-making phases of his model. Thus, reviews may provide value in long-term learning and in short-term decision-making. For instance, one study found that the average consumer spent 6 hours and 21 minutes learning about a product online, even though they needed just 6 minutes to shop at retailer websites [66]. A recent survey notes, "one in two respondents spent 75% of their overall shopping time researching product (sic) as compared to just 21% in 2010" [64]. Thus, our second research question is: to what extent does anchoring affect learning and purchase decision-making?
Assessing how anchoring affects H-Vote ratios is important for four reasons. First, previous studies rarely examined the anchoring effects on H-Vote ratios. Second, if anchoring is confirmed on H-Vote ratios, how does it influence consumers' helpfulness voting behavior? Third, such anchoring may depend on the type of good and review. Fourth, anchoring should be examined for both short- and long-term effects because consumers use reviews to purchase at some later time.
The next section surveys the theoretical background pertinent to our research questions. We then set forth hypotheses, followed by methodology and results. Finally, we discuss the implications of our study, the future research agenda, and the consequences of our conclusions.

Theoretical Background
IQ indicators of consumer reviews. Existing studies examine the quality of reviews by using two approaches. In the first approach, researchers ask which content characteristics indicate the IQ of helpful reviews. The second approach assesses the authenticity of reviews.
To consumers, one of the most visible IQ indicators is the count of YES/NO votes of how many other consumers find a review helpful. This count gives an H-Vote ratio, or summary IQ indicator. Existing studies use these votes as the most critical IQ judgments from consumers. Assuming the votes are authentic, these studies examine what characteristics highly helpful reviews have in common. They have identified three basic IQ indicators along with three circumstantial factors (Table 1).
The three basic IQ indicators of a review are its length, readability, and meaning or substance. Review length is significantly related to IQ because helpful reviews usually use more words to describe the product. Quality reviews are also easy to read: by combining four readability indices to assess the complexity of texts and ease of reading, Korfiatis et al. [41] found that helpful reviews were more readable than other reviews. A review's substance or meaning also influences the review IQ, though determining "the exact meaning of text is extremely difficult and often subjective" [14] p. 515. Using latent semantic analysis (LSA), Cao et al. [13] found that the variety, frequency, and pattern of words (i.e., concepts) used in a review suggest its substance. In addition, Pan and Zhang [60] reported that the innovativeness of a reviewer affected the review IQ, with moderate innovativeness being better than extreme.
Reviewer credibility and reputation have also been reported to be factors in review IQ [5], [44].
In addition to the three basic IQ indicators, circumstantial IQ factors such as goods type, valence, and total votes also influence the perception of review IQ. The goods type classifies a product based on whether it is feasible to evaluate its quality before or after using it. Some products have tangible characteristics that can be easily evaluated before purchase (e.g., electronics), while others have subjective characteristics among users (e.g., music, wine). Some products claim to have characteristics that can only be evaluated after long-term usage (e.g., vitamin supplements) or in special circumstances (e.g., insurance). Valence can also influence review IQ. A product review with a real name and/or verified purchase is usually considered to have higher IQ than similar reviews without those attributes. Finally, the total vote count affects IQ because it is related to the number of shoppers who have read a review, making it a circumstantial indicator of the review's popularity, which directly relates to its quality.
The second approach to review IQ is to question whether a review is fake. Much research has been done to optimize spam-detection algorithms and models (e.g., [38], [46], [48]). However, detecting fake reviews requires detailed data on the review's posting date, the reviewer's ID, and the textual characteristics of reviews. Average consumers have no easy way to tell which reviews are fake, even when reviews have some suspicious characteristics (e.g., multiple reviews on a product from the same reviewer, use of extreme language, unfair views on product features).
In theory, consumers have what they need to judge the IQ of a review. They can consider its length, readability, and substance: the three basic IQ indicators. However, this judgment does not seem simple because the basic IQ indicators that make a review helpful vary by the goods type and review valence. Also, consumers can see how others judged the same review, which is the summary IQ indicator.
Consumer decision-making. During the last few decades, consumer decision-making research has grown significantly, producing a number of new decision theories and models [9], [73] that depend on consumer profiles (e.g., demographics), product categories (e.g., utilitarian vs. non-utilitarian goods), and purchase types (e.g., in-store vs. online purchase). Smallman et al. [74] summarize the consumer decision-making paradigms: from prescriptive, analytical decision-making to bounded rationality, adaptive decision-making, and more recent pragmatic and naturalistic decision-making. We limit the scope of the present literature review to the issues associated with online consumers, primarily regarding online purchase decision-making.
Several studies have proposed models of online consumer decision-making. Two studies [39], [88] offer consumer motivation models that incorporate the technology acceptance model [18], the theory of reasoned action [26], and/or
With the emergence of online searching and purchasing, the cognitive efforts of consumers are now supplemented by "electronic decision aids" [7] such as H-Votes, star-rating summaries, and prominent displays of the most helpful reviews. However, the effectiveness of these electronic decision aids is somewhat inconclusive, according to empirical studies [29]. A common decision aid is the product recommendation system, now used widely at retailer websites. For example, (Site 1) displays recommended products, based on a customer's profile, purchase history, and other factors, using sophisticated algorithms such as collaborative filtering and cluster modeling [49]. Such tools go hand in hand with the customers' process of learning the true quality of a product. This leads some research to focus on how customers learn (e.g., [69], [90]). Many electronic decision aids include review quality or helpfulness in their recommendation factors. However, helpfulness votes can be manipulated, and these fake votes can add up to change the overall opinion [65], [89]. The influence of fake IQ indicators has been understudied, but these indicators are becoming increasingly important in consumer learning, so they should be examined in the context of reviews.
Anchoring-and-adjustment heuristic. Judging the IQ of a review is not a simple, objective task because of the varying (and often complex and subjective) product characteristics and the uncertainty in review authenticity, as noted earlier. Under these circumstances, consumers often rely on other consumers' judgments as anchors to start their evaluation and adjust their own judgments as needed. This phenomenon is known as the anchoring-and-adjustment heuristic [24], [82]. According to Tversky and Kahneman [80], individuals make estimates by starting from an initial value and adjusting it to reach their final answer. Offered different starting points, individuals anchor their final estimations toward those values, leading to biased outcomes.
Anchoring is well studied and often used in marketing, especially in pricing tactics [3], [30], [68], [81] such as price comparison, everyday low pricing, and reference price anchoring. This type of anchoring is commonly known as numerical anchoring, which is defined as "the assimilation of a quantitative estimate towards a previously presented number" [67]. Studies of numerical anchoring have produced various findings: numerical anchoring is stronger under time pressure [43], subliminal anchoring exists [67], and anchoring depends on access to other considerations [27]. Also, when anchors are placed outside the plausible value range, their effects diminish [83], [84]. Recent studies [10], [27], [84] also consider the elaboration likelihood model (ELM) [62], [63] and how anchoring affects decision-making in situations that require higher or lower cognitive loads. Numerical anchoring affects decision-making both in relatively thoughtful, high-elaboration processes and in relatively non-thoughtful, low-elaboration processes. Interestingly, when asked to assess the price a second time, individuals in high-elaboration processes adjusted less than those in low-elaboration processes [83].
Most research has focused on numerical anchoring in pricing policies, and few studies have explored how anchoring can influence judgment of review IQ. In consumer reviews, some IQ indicators are numbers, while others are not. The summary IQ indicator, in particular, is a ratio of two numbers: YES votes and total votes. Thus, from an anchoring perspective, consumer reviews pose a unique challenge for investigating the influence of anchoring.
Anchors, the data giving an initial reference point for a judgment, are likely to influence consumers. Similarly, summary IQ indicators (H-Vote ratios) may act as anchors, not only in individuals' judgments of review helpfulness, but also in their process of learning about the product (as a delayed anchoring effect). This leads us to our research objective of studying how anchoring from fake summary IQ indicators influences consumer reviews at two levels (Figure 2).

Hypotheses
This study addresses two basic questions. First, does anchoring change consumer IQ judgments when consumers vote on the helpfulness of a review? Second, how does anchoring influence the two latent foci (learning and decision-making) differently as consumers rate review IQ? While more than one indicator affects review IQ, the most helpful reviews on (Site 1) (Figure 1) enable us to control the three basic IQ indicators and total votes. Because these most helpful reviews are given for a specific product and review valence, we can examine our research questions for specific review valences and product types (Figure 3). In the following, we first define product types and then elaborate our hypotheses that address these two basic questions.

Search, Experience, and Credence (SEC) Goods and Decision Anchoring
Consumers evaluate review IQ differently based on goods type. While there are several ways to categorize goods, our research categorizes them into SEC goods. Search goods are those whose quality can be evaluated prior to purchase or use. Experience goods are those whose quality can be examined only after use [55]. Credence goods are those whose qualities cannot be evaluated in normal use even long after their purchase [15]. We use these three types for the following reasons.
Extant studies [31], [35], [52], [54] on online consumer behaviors and reviews frequently select products based on SEC categories. The searchable characteristics of search goods are particularly relevant in our study because reviews of search goods should be based on objective claims about their tangible attributes [52]. If the claims made by a review are objective and conceivably credible, there may be less room for manipulation. In contrast, the quality of experience goods depends on both objective and subjective experiences [86]. Thus, advertisement more strongly affects the perceived quality of experience goods than search goods [56]. The quality of credence goods is even more difficult to evaluate and thus may be even more easily influenced by advertisement. Because helpfulness voting is similar to advertisement in how it influences consumers' evaluation of review quality, we use this classification method to examine how anchoring affects the three goods types.

Does Anchoring Exist in YES/NO Votes?
For a search good, consumers appreciate a review more when it specifically addresses the tangible product characteristics: for example, "This printer took just 1 minute to print 80 pages, and the prices of similar printers are at least $50 higher than this printer." In contrast, credence goods have few tangible benefits that can be easily evaluated, so consumers might appreciate reviews that address reasons why they should avoid buying the product: for example, "I had an allergy to this vitamin." Consumers also adopt a decision strategy that varies depending on the goods type. For example, when a consumer evaluates search goods, it is easier to use a compensatory strategy; when evaluating a credence good, a non-compensatory strategy might be more effective [20], [71], [79]. Table 2 shows the preferred content and decision strategy for each goods type.
A compensatory strategy uses a weighted-additive rule for the benefits of search goods. In this approach, a consumer first assigns a weight to each attribute, finds a total score for each product based on the sum of the weighted attribute scores, and then selects the product that has the highest total score [10]. This strategy is easy to use for search goods whose attributes are tangible. It can also be applied to experience goods because consumers can add up the qualitative positive experiences of other consumers. Note that most online reviews are not well structured: these reviews are considered an informal form of knowledge sharing [61]. Thus, consumers must mentally construct a compensatory assessment of attributes from the review. The content of such a review is then subject to anchoring from the summary IQ indicator.
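As an illustration only, the weighted-additive rule described above can be sketched in a few lines of Python. The product names, attribute names, scores, and weights below are hypothetical and are not taken from the study; the sketch simply shows how weighted attribute scores are summed and the highest total is selected.

```python
def weighted_additive_choice(products, weights):
    """Return the name of the product with the highest weighted attribute score."""
    def total(attributes):
        # Sum of weight * score over every attribute the consumer cares about.
        return sum(weights[name] * attributes[name] for name in weights)
    return max(products, key=lambda product: total(products[product]))


# Hypothetical attribute scores on a 1-5 scale, as a consumer might infer from reviews.
printers = {
    "Printer A": {"speed": 5, "price": 2, "print_quality": 4},
    "Printer B": {"speed": 3, "price": 5, "print_quality": 3},
}
# Hypothetical importance weights assigned by the consumer.
weights = {"speed": 0.5, "price": 0.2, "print_quality": 0.3}

best = weighted_additive_choice(printers, weights)
print(best)
```

Because a low score on one attribute can be offset by a high score on another, the rule is compensatory: here a weak price score does not rule a printer out if its speed and print quality are strong enough.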
A non-compensatory strategy utilizes elimination-by-aspects (EBA): If a product fails to satisfy particular requirements, it is dropped from consideration. This process is repeated until one product remains [79]. Consumers can use EBA even without complete information about products' features and functions. EBA is especially useful in evaluating the quality of negative reviews. When it comes to negative reviews, anchoring may sway review judgments more for experience and credence goods than for search goods. Compensatory and non-compensatory strategies are applicable and valid for consumer decision making, remaining robust even in recent empirical studies [37], [58], [59], [78].
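A minimal sketch of the elimination process described above follows, under simplifying assumptions: the aspects are checked in a fixed order chosen by the consumer (a simplification of EBA, which does not prescribe a fixed order), and the product names, aspects, and thresholds are hypothetical.

```python
def eliminate_by_aspects(products, requirements):
    """Drop products that fail each requirement in turn; stop when one remains."""
    remaining = dict(products)
    for aspect, is_acceptable in requirements:
        if len(remaining) <= 1:
            break
        # Keep only the products that satisfy the current requirement.
        remaining = {
            name: attributes
            for name, attributes in remaining.items()
            if is_acceptable(attributes[aspect])
        }
    return list(remaining)


# Hypothetical information a consumer might extract from negative reviews.
supplements = {
    "Supplement A": {"allergy_reports": 3, "price": 12.0},
    "Supplement B": {"allergy_reports": 0, "price": 25.0},
    "Supplement C": {"allergy_reports": 0, "price": 15.0},
}
# Requirements in the (assumed) order of importance to the consumer.
requirements = [
    ("allergy_reports", lambda count: count == 0),
    ("price", lambda dollars: dollars <= 20.0),
]

shortlist = eliminate_by_aspects(supplements, requirements)
print(shortlist)
```

Unlike the weighted-additive rule, a single failed requirement removes a product regardless of its other attributes, which is why one damaging claim in a negative review can be decisive.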
Overall, when consumers read positive reviews of products whose attributes and benefits are tangible, these reviews can give them a list of clear reasons to buy the product or not. Similarly, a negative review of a product whose attributes or benefits are unclear, but whose description in the review is damaging, can give consumers a clear signal to stay away from the product. Thus, we hypothesize:

Do the Two Hidden Foci of Review Helpfulness Exist?
Consumers buy a product based on two sub-processes: learning and purchase decision-making [76].
Learning is defined as "the vicarious acquisition of cognitive competencies that consumers apply to purchase and consumption decision-making" [13]. Decision-making is the next logical step of product selection, in which consumers make a purchase decision. Consumer learning is a central construct used in consumer behavior models [36], but psychological decision research tends to neglect the influence of learning [8], [9]. For example, it was not until recently that some studies examined the influence of learning in online environments (e.g., [13], [89], [90]).
Reviews could help consumers in both learning and decision-making. These two foci are intimately related: the same review may help in completely different ways with respect to each focus, for different consumers or even for the same consumer in different sub-processes depending on context. The consumer must therefore decide on a review's helpfulness in both of these aspects [57].
Additionally, the characteristics of SEC goods may affect consumers' perceptions of helpfulness differently in learning and decision-making [17]. Search goods have attributes and functions that are easier to verify, so consumers of these goods, whether they are in the learning or decision-making sub-process, could easily grasp the review content according to their needs. In contrast, credence goods have attributes that are very difficult to verify, so consumers engaging in either learning or decision-making would be challenged in clearly differentiating the helpfulness of either aspect. Thus, we hypothesize the following:
H2a: Consumers will perceive the helpfulness of a given review differently based on differences in their needs related to their learning and decision-making sub-processes.
H2b: Depending on the goods type, consumers experience different levels of difficulty in determining the helpfulness of a review's content in learning and decision-making because of differences in how difficult it is to verify that content.The order of difficulty from easiest to hardest is search goods, experience goods, and credence goods.

How does Anchoring Affect the Two Hidden Foci Differently?
The next logical question is this: how does manipulation of review helpfulness voting affect the perception of learning and decision-making based on review content? In other words, how does the anchoring effect in helpfulness voting affect these two foci?
The IQ of a review depends on (a) learning values, (b) decision-making values, or (c) both learning and decision-making values. When consumers vote YES or NO on whether a review is helpful, their decisions depend on the weights of the two foci. Some consumers may vote YES based mainly on whether the review helps them decide whether they should buy the product (decision-making). Some vote depending on how much they learn from a review (learning). Other consumers use weighted averages of these criteria for their judgment of review IQ.
The purchase decision-making sub-process is dichotomous (to buy or not to buy), while the learning sub-process is multidimensional and based on key product attributes. Learning could be evaluated with a graded scale such as the Likert scale, similar to valuation using choices with multiple-bound questions [6]. Previous studies [19], [85] indicated that preference distributions are usually different between the dichotomous mode and the multiple-bounded, discrete-choice (MBDC) mode. Thus, we expect that anchoring will have different effects in learning and in decision-making.
The normative decision-making model says that anchoring is used when consumers must engage in conscious value-judgment [11]. Such behavior is likely to occur when consumers make a purchase by adding up the tangible benefits noted in positive reviews of search goods. While reviews for experience and credence goods have more subjective content, their positive reviews can also provide pleasant experiences as scalable units of qualitative benefit, which consumers would add up in their minds.
The learning aspect of positive reviews for search and experience goods potentially involves many cues with respect to anchoring. However, negative reviews of experience and credence goods involve a limited number of cues for decision-making using EBA. Because these goods usually have attributes that are intangible or difficult to verify, consumers are unlikely to distinguish clearly between the anchoring effects on the learning and decision-making foci. Thus, we propose these hypotheses:
Our theoretical model and hypotheses are summarized in Figure 4.

Method
To test the hypotheses, we set up an online survey questionnaire that gives reviews close in appearance to what consumers see on Amazon.com (Site 1). Using the most helpful reviews (as seen in Figure 1), we manipulated the vote ratio of each review in three different ways (as three treatment conditions): a perfect ratio (100% H-Vote ratio), half that ratio (50%), and no visible ratio or votes. Our survey presented each participant with the same three reviews, but with a treatment condition randomly selected from the three (Figure 5). On (Site 1), the most helpful positive reviews often have well above a 90% H-Vote ratio, while the most helpful negative reviews commonly have around 70%. We chose a 50% ratio for the middle condition because it is lower than these typical H-Vote ratios but not as extreme as a 0% ratio. After showing a review with a manipulated anchor, we had each participant evaluate the review for (a) its helpfulness with a YES/NO choice, (b) its helpfulness in learning on a 5-point Likert scale, and (c) its helpfulness in decision-making on a 5-point Likert scale (Appendix A). The reviews were manipulated without affecting the font size or display position of the H-Votes.
Prior to administering the questionnaire, we conducted a pilot study to test the validity of the instruments with 108 students from two Midwestern and Southwestern universities in the United States. We used the same survey platform and variable constructs for the current study. In particular, we tested a review on a vitamin supplement to assess whether most participants have adequate knowledge of and experience with such a good. The remarks from the pilot study participants confirmed that they were able to follow the survey questionnaire without confusion. The spread of their answers agreed with our expectations as to which levels of numerical anchors should be used.
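The treatment manipulation described in this section can be sketched as follows. This is a hypothetical illustration, not the survey platform's actual logic: the condition labels, the total of 20 votes, and the function names are assumptions, and the wording of the displayed line follows the example quoted in the Introduction.

```python
import random

# The three treatment conditions: 100% H-Vote ratio, 50% ratio, no visible votes.
CONDITIONS = ("100%", "50%", "none")
TOTAL_VOTES = 20  # assumed total; the source does not state the displayed total


def vote_line(condition, total_votes=TOTAL_VOTES):
    """Build the helpfulness line shown above a review for a given condition."""
    if condition == "none":
        return ""  # no ratio or votes are displayed
    yes_votes = total_votes if condition == "100%" else total_votes // 2
    return f"{yes_votes} of {total_votes} people found the following review helpful"


def assign_condition(rng):
    """Randomly pick one of the three treatment conditions."""
    return rng.choice(CONDITIONS)


rng = random.Random(2012)  # seeded only to make the illustration repeatable
condition = assign_condition(rng)
print(condition, "->", vote_line(condition) or "(no votes shown)")
```

Only the displayed counts change between conditions; the review text itself stays identical, which is what lets any difference in responses be attributed to the anchor.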
Selection of products and their reviews. Two products were selected from each category of search goods, experience goods, and credence goods. We randomly picked them from the first page of (Site 1) Best Sellers in the Printers (search goods), Music CDs (experience goods), and Vitamins and Supplements (credence goods) sections in January 2012. The intent of using only two goods per category is not to investigate all kinds of goods exhaustively, but to dispute the claim that no anchoring effect is seen across all goods. Printers have the general characteristics of search goods, such as hardware specifications and measurable performance characteristics. Music CDs are considered experience goods because their value depends on personal subjective tastes and is hard to estimate prior to purchase. These two goods were used as search and experience goods in a recent study by Mudambi et al. [52].
Vitamins and supplements are considered classic credence goods [15]: their effect on health is not apparent even after extended use. For those six goods, we used a pair of the most helpful positive and negative reviews displayed by (Site 1) at the time. Table 3 summarizes the key characteristics of the selected reviews.
Data collection. The data were collected from a survey questionnaire in 2012. We solicited volunteers via (Site 2), offering a $5 (Site 1) gift card for completing the questionnaire. We also solicited volunteers at the two universities where we conducted the pilot study, offering the same incentive. Because online questionnaires may collect multiple or automated entries, we screened invalid survey entries with the following criteria: (1) The participants must be 18 years or older and U.S. residents; (2) The IP addresses of participants must be verifiable U.S. IP addresses (no proxy servers); (3) The participant's name, email address, and mailing address must have some indication of validity by matching name and email address as well as name and address via (Site 3); (4) The text comments must be in English; and (5) All the questions must be answered fully. In the end, we collected 294 valid entries.
Sample. We drew survey participants from two universities (n = 125) and the general public (n = 169). We used ANOVA (Analysis of Variance) to see if there were any statistical differences in how the two sample groups cast H-Votes. The results of ANOVA (dependent variables: H-Votes for the 6 goods, factor: the two groups) were not significant.
Variables. We gathered H-Votes by asking the following question: "Do you find the above review helpful?" and prompting a YES or NO answer, just as (Site 1) does. The two latent foci of H-Votes (learning and decision-making) were measured with the question, "To what extent do you agree: the above review is helpful to me to LEARN (or MAKE A PURCHASE DECISION) about this product," which prompted a response on a 5-point Likert scale (strongly disagree to strongly agree). For control variables, we used consumer age, ethnicity, income, and use of mobile phones for shopping. Table 4 shows the categories of age, ethnicity, and income, among others. We assessed consumers' use of mobile phones for shopping by asking participants to respond to the statement, "I always use a mobile phone before or during shopping to get product/vendor/retailer information," on a 5-point Likert scale (strongly disagree to strongly agree).

Results
We summarize the results based on our three hypothesis pairs.

Influence of Helpfulness Voting Manipulation on Subsequent Votes (H1a and H1b)
To test the influence of helpfulness voting manipulation on subsequent votes across SEC categories or anchoring effect (H1), we used cross tabulations to account for the treatment conditions and consumer profile. The commonly used guidelines for cross tabulations [51] p. 626, [87] p. 734 are: (1) No more than 20% of the expected counts in cells are less than 5; and (2) All expected counts are 1 or greater. To minimize Type II errors (false negatives), we devised two-step crosstab analyses. First, we divided each consumer profile variable into upper and lower halves using the median (e.g., age, income), or into the majority and the non-majority (e.g., ethnicity). There are three treatments, so the number of crosstab cells was 2×3=6. Second, we conducted crosstab analyses with all possible pairs of the three treatments. By using 2×2 crosstabs, we maximized the number of statistically significant results that met the crosstab statistical requirements.
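The two crosstab guidelines quoted above can be checked mechanically. The sketch below, offered only as an illustration, computes expected counts from the row and column totals and applies both rules; the function names and the example tables are hypothetical and do not reproduce any data from the study.

```python
def expected_counts(table):
    """Expected cell counts under independence: row total * column total / N."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def meets_crosstab_guidelines(table):
    """Guideline 1: at most 20% of expected counts are below 5.
    Guideline 2: every expected count is at least 1."""
    cells = [count for row in expected_counts(table) for count in row]
    share_below_5 = sum(count < 5 for count in cells) / len(cells)
    return share_below_5 <= 0.20 and min(cells) >= 1


# Hypothetical 2x2 tables: rows are two treatment conditions, columns are YES/NO votes.
well_populated = [[40, 10], [25, 25]]
sparse = [[8, 1], [9, 2]]

print(meets_crosstab_guidelines(well_populated))
print(meets_crosstab_guidelines(sparse))
```

Splitting a 2×3 table into 2×2 pairs leaves fewer cells to fill, which is one reason a pairwise comparison can satisfy these guidelines where the full table might not.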
As shown in Table 5, anchoring appeared for at least one product in all the studied categories (Appendix B lists statistical details [1], [91] for Table 5). For positive reviews, anchoring effects appeared in 3 out of the 4 products (3 significant items for 3 goods in cells 1 & 2). For negative reviews, anchoring effects appeared for at least one product in each good type (6 significant items for two products in cells 5 & 6). Some consumers reversed their
*: α = .10, **: α = .05, ***: α = .01

Decision-Making and Learning are Different (H2a and H2b)
The variables for the degrees of helpfulness in learning and decision-making were statistically non-normal. Thus, we used Wilcoxon signed-rank tests to examine their statistical differences (Table 6). These non-parametric paired tests indicated that the two foci were different across all the SEC categories (i.e., each good in Table 6 has at least one significant z-value). This result supports H2a. H2b is supported by the trend in differentials, which increased from credence goods (one in four cases or 25% at α = .05) to search goods (three in four cases or 75% at α = .05), as shown in the rightmost column of Table 6.
Table 6: Differentials between learning and decision-making helpfulness for the same review

Influence of Helpfulness Voting Manipulation on Two Hidden Dimensions (H3a and H3b)
Anchoring effects on learning and decision-making (H3) were tested with non-parametric one-way ANOVA tests, also known as Kruskal-Wallis tests. As shown in Table 7, there were significant anchoring effects in the two foci, learning and decision-making. (The details of these statistical results are in Appendix C.) Among the positive reviews for search goods and experience goods (cells 1 & 2 in Table 7), the decision-making (D) focus had 6 instances of anchoring, while the learning (L) focus had only 1 instance of anchoring. (Hereafter, we refer to instances of anchoring in decision-making and learning as D's and L's, respectively.) This result supports H3a (many more D's than L's). For the negative reviews of experience and credence goods (cells 5 & 6 in Table 7), we found 5 D's vs. 4 L's for experience goods and 0 D's and L's for credence goods. These results support H3b (similar number of D's and L's).
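For readers unfamiliar with the Kruskal-Wallis test used above, the following sketch computes its H statistic from ranked responses, with the standard correction for ties. It is a textbook illustration rather than the study's actual analysis: the Likert responses are invented, and the 5.991 threshold is the chi-square critical value for 2 degrees of freedom at α = .05, which is the usual large-sample approximation for three groups.

```python
def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic with tie correction for a list of samples."""
    pooled = sorted(value for group in groups for value in group)
    n = len(pooled)
    rank_of = {}   # value -> average rank (midrank) among the pooled data
    tie_term = 0   # sum of t^3 - t over each run of t tied values
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        tie_term += (j - i) ** 3 - (j - i)
        i = j
    rank_sum_term = sum(
        sum(rank_of[value] for value in group) ** 2 / len(group) for group in groups
    )
    h = 12 / (n * (n + 1)) * rank_sum_term - 3 * (n + 1)
    # The correction factor is undefined (zero) if every response is identical.
    return h / (1 - tie_term / (n ** 3 - n))


# Invented 5-point Likert responses under the three treatment conditions.
responses = {
    "100% ratio": [5, 4, 5, 4, 5, 4],
    "50% ratio": [3, 2, 3, 4, 2, 3],
    "no votes": [4, 3, 4, 3, 4, 5],
}
h = kruskal_wallis_h(list(responses.values()))
print(round(h, 3), "significant at .05" if h > 5.991 else "not significant at .05")
```

Because the statistic depends only on ranks, it makes no normality assumption, which suits ordinal Likert responses.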

Discussions and Implications
Most importantly, we found the anchoring effect in consumers' evaluations of review IQ in (1) the YES/NO questions and (2) the grading scale for learning versus buying. The extent of anchoring depended on the review valence (positive or negative), SEC category (search, experience, or credence good), and consumer profile (e.g., age, gender, ethnicity). Anchoring can manifest instantly with purchase decisions or can surface over time through cumulative product learning. Table 8 summarizes the results of hypothesis testing.

H2b: Their differences increase in the order of search, experience, and credence goods. (supported)
H3: How do fake summary IQ indicators affect the two hidden foci differently?
H3a: They affect decision-making more than learning for positive reviews of search and experience goods. (supported)
H3b: They have a similar effect for negative reviews of experience and credence goods. (supported)

Practical consumer implications. Consumers are frequently not experts in purchasing high-tech electronics (search goods) and vitamin supplements (credence goods). This is partly because consumers purchase these goods relatively infrequently, partly because the goods often have complex attributes, and partly because judging a good can be challenging. As an example, some reviews claim that a printer from a given manufacturer suffers from paper jams. However, verifying this claim is not easy. Well-known online retailers offer many products in a given category to choose from; consumers typically see the products having mostly decent star ratings; and popular products have many reviews. Given these conditions, consumers often turn to the most helpful reviews prominently displayed on the first review page. This is perhaps the most common shopping experience among today's consumers.
As consumers read reviews and look for compelling reasons to buy a product, anchoring from H-Vote ratios, even though it is just one piece of information, can make a review look better or worse than it really is (H1a). Likewise, when consumers read reviews to find reasons against buying a product, anchoring can make the same review more or less convincing (H1b). Additionally, anchoring works differently in different demographics, including age, gender, and race (Table 6).
Consumers generally find a review beneficial either because it helps them learn more about the product or helps them decide to purchase it (H2a). For example, a consumer looking to buy a color laser printer or vitamin supplement would probably learn more about the product as they continue to research it. For goods like vitamins, further research may not help the consumer determine which product to buy; attempting to learn about these goods is likely futile. For products with tangible qualities, further research can make consumers more confident in their buying decisions. Learning about these goods is likely productive, and in this case a spread in review IQ is likely present between reviews that educate consumers and reviews that encourage them to buy the product (H2b). As a product's price increases, consumers become more likely to spend some time learning about the product, so teaching reviews are more important for expensive products than for cheap products.
A given positive review with a fake high H-Vote ratio is more convincing than an identical review with a fake low H-Vote ratio, and this tendency is even stronger when the reviewed product has tangible or subjective benefits (H3a).
The sensitivity to fake votes also depends on the consumer profile (Table 8), probably because each product has its own target audience. However, fake votes do not affect negative reviews of products like music albums and vitamins (H3b). In this case, someone's dislike does not equate to a potential buyer's dislike when it comes to subjective goods such as music albums and wine.
The good news is that fake helpfulness votes only fool certain consumers. Even so, consumers should be aware of fake votes because sellers can target consumers based on their profiles.
Detecting anchoring using the Guttman scale. The H-Votes are based on the binary mode of YES or NO, a classical Guttman scale [28]. Such a scale simplifies the response process, but it may oversimplify it [75]. For instance, a consumer leaning toward YES (or NO) who is subject to some anchoring may still give the same vote because the response scale hides the change in opinion. Thus, detecting anchoring effects is harder with Guttman scales than with Likert scales. Considering this possibility, the support for H1a and H1b should be read as statistically significant results obtained under the most challenging circumstances. The anchoring changed consumer perception by only about one point when the reviews were evaluated based on objective criteria. When consumers read negative personal stories about experience and credence goods, though, anchoring affected these goods more than search goods. That is, manipulations of H-Votes can more significantly affect negative reviews of experience and credence goods.

Meaning of the two hidden dimensions of helpfulness. Consumers generally perceive different values for learning and decision-making from the same review. This study used 12 most helpful reviews to test the hypotheses. If a review is extremely helpful or poorly written, the helpfulness for these two foci would most likely be similar because the scales of both foci would be impacted by their extremity. In this sense, we used the most challenging reviews to test the hypotheses. Yet, the two different foci are evident in the results. As hypothesized, products with tangible attributes produce more differences between learning and buying, while products with subjective attributes or unclear benefits produce fewer differences. The negative z values indicate the helpfulness for learning is higher than that for decision-making. This means that consumers are appropriately using more stringent criteria to judge the review helpfulness for deciding to buy a product than for learning about it. The learning focus has cumulative effects over time, while the decision-making focus has immediate effects on purchase behaviors. Thus, we argue that researchers should treat these foci differently. For some goods, one focus may be more important than the other.
Complex goods, such as digital cameras and high-end electronic products, likely require more product learning unless the consumer is an expert on such goods. In contrast, frequently purchased goods, such as printer toner and popular dietary supplements, likely require a focus on decision-making. Product/marketing managers should incorporate the characteristics of the two foci in their product marketing strategies using reviews.
Consumer demographics and anchoring. A key message from Tables 5, 6, and 7 is that anchoring is specific to consumer demographics. For example, fake votes on a positive review for one music product tricked one ethnic group but not the other, but for a second music product, fake votes affected a third ethnic group. Some consumers can process more reviews than others in the same amount of time [42], and consumers with certain profiles find more value in some reviews over others. Anchoring in valuing reviews can manifest in present and future purchases specific to consumer demographics. For product marketers, this exploratory study demonstrates that it is possible to attract certain customer segments by having reviews favoring their products with falsely high H-Vote ratios. Customers with certain profiles are more susceptible to manipulated numbers.
As consumers, we should maintain a healthy skepticism of the evaluations we see for reviews. For example, certain types of music albums are liked by consumers with specific profiles. At the same time, these customers might be even more enthused with these albums after seeing artificially positive scores on reviews that strongly favor the albums. The positive biases of reviews are noted even in USA Today: "At one time on TripAdvisor, a travel review site, there were more than 50 million reviews and an average rating of 3.7 stars out of 5. Pretty high" [47].
Concerning travel reviews, one reporter commented, "People don't necessarily expect the truth when they click on a review site. Truthy is good enough" [21]. Consumers have good reason to be aware that anchoring can cloud their purchase judgments.
Tradeoffs in judging reviews with YES-or-NO scales. As researchers, we should be mindful that review evaluations are not objective. Just like public opinions aroused or swayed by a few vocal groups and entities, a review's helpfulness could be manipulated just the same. In particular, the Guttman scale commonly used in rating the helpfulness of reviews should raise interesting debates. As Masoff [50] discussed extensively, the Likert and Guttman scales have different assumptions and aims. The two latent foci of review helpfulness follow non-normal distributions. The star rating of products is known to follow a J-distribution [34]. Thus, not using an interval scale may have been an effective decision for (Site 1). At the same time, our study revealed the hidden values of reviews beyond the YES/NO basis. Overall, there seem to be theoretical and practical tradeoffs when choosing a measurement scale for review helpfulness.
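
The measurement tradeoff can be illustrated with a toy calculation showing how a binary YES/NO vote can hide a shift that a graded scale would register. The 7-point scale, the midpoint threshold, the opinion scores, and the one-point upward shift are all assumptions chosen for illustration, not data from the study.

```python
def binary_vote(score, threshold=4):
    """Collapse a 7-point helpfulness score into a YES (True) or NO (False) vote."""
    return score > threshold

# Hypothetical unanchored opinions of ten consumers on a 7-point scale.
unanchored = [1, 2, 3, 4, 5, 6, 7, 2, 6, 4]

# Assume anchoring raises every opinion by one point, capped at the scale maximum.
shift = 1
anchored = [min(7, s + shift) for s in unanchored]

# Count how many responses change on each kind of scale.
changed_on_graded_scale = sum(a != b for a, b in zip(unanchored, anchored))
flipped_binary_votes = sum(
    binary_vote(a) != binary_vote(b) for a, b in zip(unanchored, anchored)
)

print(f"Graded responses that changed: {changed_on_graded_scale} of {len(unanchored)}")
print(f"YES/NO votes that flipped:     {flipped_binary_votes} of {len(unanchored)}")
```

In this toy setup only respondents sitting exactly at the threshold flip their vote, while nearly every graded response moves, which is the sense in which a binary scale makes anchoring harder to detect.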

Managerial Implications
Identifying fake reviews and other review manipulation is a constant battle for online vendors, with both economic and legal implications. If they cannot effectively control review manipulation, consumers will eventually lose trust in reviews. In certain circumstances, tolerance of such behavior may be seen as collaboration and lead to class-action lawsuits. Either situation would damage a company's reputation and profits.
Our study sheds some light on how manipulation affects online reviews. First, we confirmed that anchoring affects the process of rating online reviews for helpfulness, a warning to anyone who plans to use such an IQ rating as an unbiased input in other review data-mining.
Second, we revealed that different product categories are susceptible to different manipulations of the helpfulness rating. Thus, online vendors should spend time designing measures to counteract or mitigate the influence of anchoring in specific product categories.

Limitations and Future Research
This study has a few limitations. The survey questionnaire was administered in 2012; however, we have not seen significant changes in how reviews are rated at most websites. We did not consider consumers' actual purchase histories and product expertise. We used ordinary, common products with which many consumers likely have at least some purchase experience and knowledge. Future studies may want to explore the relationship between actual purchase records and the anchoring effect. A neurocognitive or neuromarketing approach [12], [25] could advance beyond the findings of this study.
Reviews of subjective products, such as music albums, were affected by fake votes differently depending on the consumer's age, gender, and ethnicity. This result is intuitive because we know different consumer groups prefer certain types of music. Using larger samples, we could refine the findings of this study. Future studies should also conduct sensitivity analyses of the relationships among anchoring, product interests, and consumer profile.
This exploratory study used 12 reviews of bestselling products as benchmarks, but they may not be sufficiently representative. To increase the variety of reviews, future studies could sample more products in each SEC good category to ensure the results are statistically sound.
Another rewarding direction would be to investigate the relationship between the anchoring effect and the textual characteristics of reviews. For example, future studies could classify reviews by their readability and their proportions of objective/subjective content, then evaluate the influence of anchoring. Other possible research topics include studying how anchoring affects repurchase intention and perception of product quality.

Conclusion
Popular products in online stores often have overwhelming numbers of reviews. One simple way to aid consumers in analyzing reviews is to allow them to vote YES or NO on whether a review is helpful, collecting the votes to signal the quality of the review. However, online reviews are often manipulated, so how do these fake votes affect consumers? To clarify this, we asked two research questions: First, to what extent do fake votes anchor consumers' judgments of review IQ? Second, do fake votes affect both review judgments and the process of researching a product and deciding to purchase it?
This exploratory study is one of the first to report that some consumers, anchored by fake vote counts, reversed their YES/NO vote on review IQ. While previous studies assumed that vote counts are reflections of perceived review IQs, our results show that this assumption is not always valid. Not only did fake votes reverse the YES/NO decision of some consumers, but they also affected the process of review judgment and product research. Specifically, when votes are manipulated in a positive review of a search or experience good, this affects the rational or logical reasons to buy the product. When votes are manipulated in a negative review of an experience or credence good, this changes a subjective or emotional opinion as to why consumers should not buy the product. These influences depend on consumer profiles because the nature of goods varies and because specific attributes of products are different. Thus, anchoring by fake review IQ votes can influence not only subsequent votes but also consumers' product research and purchase decision-making.
Our findings provide important insights for future analysis of user-generated content and its evaluations. Previous studies tended to use publicly polled data, such as helpfulness ratings, as benchmarks or objective inputs. Our study found that anchoring affects these data and discussed how anchoring affects various product categories and consumers' processes of product learning and purchase decision-making. Consumers, retailers, and vendors should take a fresh look at review helpfulness vote counts rather than treating them as inherently trustworthy.

Figure 3: Examining anchoring effects with different review IQ indicators

Figure 4: Research model and hypotheses

Table 1: IQ indicators of reviews

Table 2: Review content orientation by SEC category

Table 3: Characteristics of most helpful consumer reviews used for the study

Table 4 shows the sample's characteristics. Traditional undergraduate students (age 18-22) made up only 15% of the sample.

Table 4: Profile of respondents

Table 5: Anchoring in the YES/NO evaluation of review helpfulness

Table 7: Anchoring in learning and decision-making helpfulness

Table 8: Summary of hypothesis testing