AWS Personalize
MLAI/RecommendSystem, 2022. 7. 7. 17:59
Built-in Recipe (Model)
SIMS
Based on collaborative filtering. SIMS identifies the co-occurrence of items in user histories in your Interactions dataset to recommend similar items. For example, with SIMS, Amazon Personalize could recommend coffee shop items that customers frequently bought together, or movies that different users also watched. Recommended for improved item discoverability and faster performance on detail pages.
HPO tunable hyper-parameter
The process of choosing the best value for a hyper-parameter is called hyper-parameter optimization(HPO).
popularity_discount_factor: Balances popularity against correlation when similarity is calculated. When you calculate similarities to a specific item, a value of 0 makes the most popular items appear as recommendations regardless of their correlation. A value of 1 makes the items with the most co-interactions appear as recommendations regardless of their popularity. Using either extreme might create an overly long list of recommended items. For most cases, a value around 0.5 works best.
Default: 0.5
Range: [0.0, 1.0]
Value type: Float
HPO tunable: Yes

min_cointeraction_count: The minimum number of co-interactions needed to calculate the similarity between a pair of items. For example, a value of 3 means that 3 or more users must have interacted with both items for the algorithm to calculate their similarity.
Default: 3
Range: [0, 10]
Value type: Integer
HPO tunable: Yes

"HPO tunable" indicates whether the parameter can participate in hyperparameter optimization (HPO).
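    import boto3

    # Assumes suffix, dataset_group_arn, and recipe_arn were defined earlier.
    personalize = boto3.client('personalize')

    create_solution_response = personalize.create_solution(
        name="DEMO-sims-solution-hpo-" + suffix,
        datasetGroupArn=dataset_group_arn,
        recipeArn=recipe_arn,
        performHPO=True,
        solutionConfig={
            'hpoConfig': {
                'hpoResourceConfig': {
                    # Both limits are passed as strings.
                    'maxNumberOfTrainingJobs': '40',
                    'maxParallelTrainingJobs': '10'
                }
            }
        }
    )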
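The call above only limits the HPO resources. As a rough sketch, the search space itself can also be narrowed through the algorithmHyperParameterRanges key of hpoConfig. The helper name build_sims_hpo_config and the narrowed ranges (0.3 to 0.7, 2 to 6) are illustrative assumptions, not recommended values.

```python
def build_sims_hpo_config(max_jobs=40, max_parallel=10):
    """Return a solutionConfig dict that narrows the SIMS HPO search space.

    The ranges below are hypothetical examples inside the documented limits
    ([0.0, 1.0] and [0, 10]); adjust them to your own data.
    """
    return {
        "hpoConfig": {
            "hpoResourceConfig": {
                # The resource limits are passed as strings.
                "maxNumberOfTrainingJobs": str(max_jobs),
                "maxParallelTrainingJobs": str(max_parallel),
            },
            "algorithmHyperParameterRanges": {
                "continuousHyperParameterRanges": [
                    {"name": "popularity_discount_factor",
                     "minValue": 0.3, "maxValue": 0.7},
                ],
                "integerHyperParameterRanges": [
                    {"name": "min_cointeraction_count",
                     "minValue": 2, "maxValue": 6},
                ],
            },
        }
    }


solution_config = build_sims_hpo_config()
print(solution_config["hpoConfig"]["hpoResourceConfig"])
```

The resulting dictionary would be passed as the solutionConfig argument of create_solution together with performHPO=True.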
Featurization hyper-parameter
Choose an appropriate value after you review user history lengths, using a histogram or similar tool. We recommend setting a value that retains the majority of users but removes the edge cases.
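One way to carry out this review is sketched below. It counts interactions per user and looks up nearest-rank percentile cut-offs. The function names and the nearest-rank rule are assumptions made for illustration; the exact percentile computation inside Amazon Personalize is not described here.

```python
from collections import Counter


def history_length_bounds(interactions, min_pct, max_pct):
    """Return (lengths, low, high) for a list of (user_id, item_id) pairs.

    lengths maps each user to the number of interactions; low and high are
    nearest-rank cut-offs at the given percentiles (an assumed rule).
    """
    lengths = Counter(user_id for user_id, _item_id in interactions)
    ordered = sorted(lengths.values())

    def quantile(p):
        idx = int(round(p * (len(ordered) - 1)))
        return ordered[min(len(ordered) - 1, max(0, idx))]

    return lengths, quantile(min_pct), quantile(max_pct)


def users_within_percentiles(interactions, min_pct, max_pct):
    """Keep users whose history length lies between the two cut-offs."""
    lengths, low, high = history_length_bounds(interactions, min_pct, max_pct)
    return {user for user, n in lengths.items() if low <= n <= high}


# Toy data: u1 has a very short history, u5 looks like a bot.
sizes = {"u1": 1, "u2": 3, "u3": 3, "u4": 4, "u5": 50}
toy = [(user, f"item-{i}") for user, n in sizes.items() for i in range(n)]
print(sorted(users_within_percentiles(toy, 0.2, 0.8)))  # ['u2', 'u3', 'u4']
```

Plotting the values of lengths as a histogram shows where the short and long tails begin, which helps pick percentiles that retain most users.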
min_user_history_length_percentile: The minimum percentile of user history lengths to include in model training. History length is the total amount of available data on a user. Use min_user_history_length_percentile to exclude a percentage of users with short history lengths. Users with a short history often show patterns based on item popularity instead of the user's personal needs or wants. Removing them can train models with more focus on underlying patterns in your data.
Default: 0.005
Range: [0.0, 1.0]
Value type: Float
HPO tunable: No

max_user_history_length_percentile: The maximum percentile of user history lengths to include in model training. History length is the total amount of available data on a user. Use max_user_history_length_percentile to exclude a percentage of users with long history lengths. Long histories tend to contain noise. For example, a robot might have a long list of automated interactions. Removing these users limits noise in training.
For example, min_user_history_length_percentile = 0.05 and max_user_history_length_percentile = 0.95 includes all users except those with history lengths in the bottom or top 5%.
Default: 0.995
Range: [0.0, 1.0]
Value type: Float
HPO tunable: No

min_item_interaction_count_percentile: The minimum percentile of item interaction counts to include in model training. Use min_item_interaction_count_percentile to exclude a percentage of items with a short history of interactions. Items with a short history are often new items. Removing them can train models with more focus on items with a known history.

max_item_interaction_count_percentile: The maximum percentile of item interaction counts to include in model training. Use max_item_interaction_count_percentile to exclude a percentage of items with a long history of interactions. Items with a long history tend to be older and might be out of date, for example a movie that is out of print. Removing these items can focus training on more relevant items.
For example, min_item_interaction_count_percentile = 0.05 and max_item_interaction_count_percentile = 0.95 includes all items except those with an interaction count in the bottom or top 5%.
Default: 0.9
Range: [0.0, 1.0]
Value type: Float
HPO tunable: No

User-Personalization
Based on automatic item exploration. It predicts the items that a user will interact with based on the Interactions, Items, and Users datasets. With automatic exploration, Amazon Personalize automatically tests different item recommendations, learns from how users interact with these recommended items, and boosts recommendations for items that drive better engagement and conversion. You can balance how much to explore (where items with less interaction data or lower relevance are recommended more frequently) against how much to exploit (where recommendations are based on what is already known, that is, relevance). Amazon Personalize automatically adjusts future recommendations based on implicit user feedback.
HPO tunable Hyper-parameter
hidden_dimension: The number of hidden variables used in the model. Hidden variables recreate the user's purchase history and item statistics to generate ranking scores. Specify a greater number of hidden dimensions when your Interactions dataset includes more complicated patterns. Using more hidden dimensions requires a larger dataset and more time to process. To decide on the best value, use HPO. To use HPO, set performHPO to true when you call the CreateSolution and CreateSolutionVersion operations.
Default: 149
Range: [32, 256]
Value type: Integer
HPO tunable: Yes

bptt: Determines whether to use the back-propagation through time technique. Back-propagation through time is a technique that updates weights in recurrent neural network-based algorithms. Use bptt for long-term credits to connect delayed rewards to early events. For example, a delayed reward can be a purchase made after several clicks, and an early event can be an initial click. Even within the same event type, such as a click, it's a good idea to consider long-term effects and maximize the total rewards. To consider long-term effects, use larger bptt values. Using a larger bptt value requires larger datasets and more time to process.
Default: 32
Range: [2, 32]
Value type: Integer
HPO tunable: Yes

recency_mask: Determines whether the model should consider the latest popularity trends in the Interactions dataset. Latest popularity trends might include sudden changes in the underlying patterns of interaction events. To train a model that places more weight on recent events, set recency_mask to true. To train a model that equally weights all past interactions, set recency_mask to false. To get good recommendations using an equal weight, you might need a larger training dataset.
Default: True
Range: True or False
Value type: Boolean
HPO tunable: Yes

Featurization hyper-parameter
min_user_history_length_percentile: The minimum percentile of user history lengths to include in model training.
max_user_history_length_percentile: The maximum percentile of user history lengths to include in model training.

Item exploration campaign configuration hyper-parameter
exploration_weight: Determines how frequently recommendations include items with less interaction data or lower relevance. The closer the value is to 1.0, the more exploration. At 0, no exploration occurs and recommendations are based on current data (relevance).
Default: 0.3
Range: [0.0, 1.0]
Value type: Float
HPO tunable: No

exploration_item_age_cut_off: Determines which items are explored based on the time frame since the latest interaction. Provide the maximum item age, in days since the latest interaction, to define the scope of item exploration. The larger the value, the more items are considered during exploration.
Default: 30.0
Range: Positive floats
Value type: Float
HPO tunable: No

Datasets and schemas
Users
Required fields
- USER_ID (string)
- 1 metadata field (categorical string or numerical)
This dataset stores metadata about your users. This might include information such as age, gender, or loyalty membership, which can be important signals in personalization systems.
Categorical metadata
With some recipes and both VIDEO_ON_DEMAND and ECOMMERCE domains, Amazon Personalize uses categorical metadata, such as a user's gender or membership status, when identifying underlying patterns that reveal the most relevant items for your users. You define your own range of values based on your use case. Categorical metadata can be in any language.
User metadata schema example
{
  "type": "record",
  "name": "Users",
  "namespace": "com.amazonaws.personalize.schema",
  "fields": [
    { "name": "USER_ID", "type": "string" },
    { "name": "AGE", "type": "int" },
    { "name": "GENDER", "type": "string", "categorical": true }
  ],
  "version": "1.0"
}
Items
This dataset stores metadata about your items. This might include information such as price, SKU type, or availability.
Required Fields
- ITEM_ID (string)
- 1 metadata field (categorical or textual string field or numerical field)
Reserved keywords
- CREATION_TIMESTAMP: For item datasets with a timestamp for each item's creation date, use the CREATION_TIMESTAMP field with a type long. Amazon Personalize uses CREATION_TIMESTAMP data to calculate the age of an item and adjust recommendations accordingly.
Categorical metadata
With certain recipes and domains, Amazon Personalize uses categorical metadata, such as an item's genre or color, when identifying underlying patterns that reveal the most relevant items for your users. You define your own range of values based on your use case. Categorical metadata can be in any language.
Categorical values can have a maximum of 1000 characters. If you have an item with a categorical value of more than 1000 characters, your dataset import job will fail.
For Domain dataset groups, both VIDEO_ON_DEMAND and ECOMMERCE domains use categorical metadata. For Custom dataset groups and custom solutions, recipes that use categorical metadata include the following:
- User-Personalization
- Personalized-Ranking
- Similar-Items
- Item-Affinity
- Item-Attribute-Affinity
Unstructured text metadata
With certain recipes and domains, Amazon Personalize can extract meaningful information from unstructured text metadata, such as product descriptions, product reviews, or movie synopses. Amazon Personalize uses unstructured text to identify relevant items for your users, particularly when items are new or have less interaction data. Include unstructured text data in your Items dataset to increase click-through rates and conversion rates for new items in your catalog.
To use unstructured data, add a field with a type string to your Items schema and set the field's textual attribute to true. Then include the text data in your bulk CSV file and incremental item imports. For bulk CSV files, wrap the text in double-quotes. Use the \ character to escape any double quotes or \ characters in your data.
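The quoting rule above can be sketched as a small helper. The function name quote_text_field and the two-column row (item ID, description) are hypothetical; the helper only applies the backslash escaping described in the paragraph and does not handle other CSV details such as embedded newlines.

```python
def quote_text_field(text):
    """Wrap text in double quotes for a bulk CSV file.

    Backslashes are escaped first so that the backslashes added for the
    double quotes are not escaped a second time.
    """
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


# Hypothetical item description containing a double quote and a backslash.
description = 'A 27" monitor, manual in C:\\docs'
row = ",".join(["item-1", quote_text_field(description)])
print(row)
```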
Text can be in the following languages:
- Chinese (Simplified)
- Chinese (Traditional)
- English
- French
- German
- Japanese
- Portuguese
- Spanish
Item metadata schema example
{
  "type": "record",
  "name": "Items",
  "namespace": "com.amazonaws.personalize.schema",
  "fields": [
    { "name": "ITEM_ID", "type": "string" },
    { "name": "GENRES", "type": [ "null", "string" ], "categorical": true },
    { "name": "CREATION_TIMESTAMP", "type": "long" },
    { "name": "DESCRIPTION", "type": [ "null", "string" ], "textual": true }
  ],
  "version": "1.0"
}
Interactions
This dataset stores historical and real-time data from interactions between users and items. In Amazon Personalize, an interaction is an event that you record and then import as training data. For both Domain dataset groups and Custom dataset groups, you must at minimum create an Interactions dataset.
Required Fields
- USER_ID (string)
- ITEM_ID (string)
- TIMESTAMP (long)
Reserved keywords
- EVENT_TYPE (string): For Interactions datasets with one or more event types, such as both click and download, use an EVENT_TYPE field. The EVENT_TYPE field must be defined as a string and can't be set as categorical.
- EVENT_VALUE (float, null): For Interactions datasets that include value data for events, such as the percentage of a video a user watched, use an EVENT_VALUE field with type float and optionally null.
- IMPRESSION (string, null): For Interactions datasets with explicit impressions data, use an IMPRESSION field with type string and optionally type null. Impressions are lists of items that were visible to a user when they interacted with (for example, clicked or watched) a particular item.
- RECOMMENDATION_ID (string, null): For Interactions datasets that use previous recommendations as implicit impressions data, optionally use a RECOMMENDATION_ID field with type string and optionally type null.
Contextual metadata
With certain recipes and recommender use cases, Amazon Personalize can use contextual metadata when identifying underlying patterns that reveal the most relevant items for your users. Contextual metadata is interaction data you collect on the user's environment at the time of an event, such as their location or device type.
Including contextual metadata allows you to provide a more personalized experience for existing users. For example, if customers shop differently when accessing your catalog from a phone compared to a computer, include contextual metadata about the user's device. Recommendations will then be more relevant based on how they are browsing.
Additionally, contextual metadata helps decrease the cold-start phase for new or unidentified users. The cold-start phase refers to the period when your recommendation engine provides less relevant recommendations due to the lack of historical information regarding that user.
For Domain dataset groups, the following recommender use cases can use contextual metadata:
- Recommended for you (ECOMMERCE domain)
- Top picks for you (VIDEO_ON_DEMAND domain)
For Custom dataset groups and custom solutions, recipes that use contextual metadata include the following:
- User-Personalization
- Personalized-Ranking
Impressions data
Impressions are lists of items that were visible to a user when they interacted with (for example, clicked or watched) a particular item. Amazon Personalize uses impression data to determine what items to include in exploration. Exploration is where recommendations include new items with fewer interactions data or relevance. The more frequently an item occurs in impressions data, the less likely it is that Amazon Personalize includes the item in exploration.
Implicit impressions
Implicit impressions are the recommendations, retrieved from Amazon Personalize, that you show the user. You can integrate them into your recommendation workflow by including the RecommendationId (returned by the GetRecommendations and GetPersonalizedRanking operations) as input for future PutEvents requests. Amazon Personalize derives the implicit impressions based on your recommendation data.
For example, you might have an application that provides recommendations for streaming video. Your recommendation workflow using implicit impressions might be as follows:
- You request video recommendations for one of your users using the Amazon Personalize GetRecommendations API operation.
- Amazon Personalize generates recommendations for the user using your model (solution version) and returns them with a recommendationId in the API response.
- You show the video recommendations to your user in your application.
- When your user interacts with (for example, clicks) a video, record the choice in a call to the PutEvents API and include the recommendationId as a parameter. For a code sample, see Recording impressions data.
- Amazon Personalize uses the recommendationId to derive the impression data from the previous video recommendations and then uses the impression data to guide exploration, where future recommendations include new videos with fewer interactions data or relevance.
Explicit impressions
Explicit impressions are impressions that you manually record and send to Amazon Personalize. Use explicit impressions to manipulate results from Amazon Personalize. The order of the items has no impact.
For example, you might have a shopping application that provides recommendations for shoes. If you only recommend shoes that are currently in stock, you can specify these items using explicit impressions. Your recommendation workflow using explicit impressions might be as follows:
- You request recommendations for one of your users using the Amazon Personalize GetRecommendations API.
- Amazon Personalize generates recommendations for the user using your model (solution version) and returns them in the API response.
- You show the user only the recommended shoes that are in stock.
- For real-time incremental data import, when your user interacts with (for example, clicks) a pair of shoes, you record the choice in a call to the PutEvents API and list the recommended items that are in stock in the impression parameter. For a code sample, see Recording impressions data.
- For importing impressions in historical interactions data, you can list explicit impressions in your CSV file and separate each item with a '|' character. See Formatting explicit impressions.
- Amazon Personalize uses impression data to guide exploration, where future recommendations include new shoes with fewer interactions data or relevance.
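For the historical import case in the workflow above, the '|' separated impression list can be produced with the standard csv module. This is a minimal sketch: the function name impressions_csv, the column order, and the sample shoe IDs are assumptions chosen to match the schema fields named in this post.

```python
import csv
import io


def impressions_csv(rows):
    """Render interaction rows as CSV text with a '|' joined IMPRESSION column.

    Each row is (user_id, item_id, timestamp, event_type, shown_item_ids).
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["USER_ID", "ITEM_ID", "TIMESTAMP", "EVENT_TYPE", "IMPRESSION"])
    for user_id, item_id, timestamp, event_type, shown in rows:
        # Explicit impressions: the items that were visible, joined with '|'.
        writer.writerow([user_id, item_id, timestamp, event_type, "|".join(shown)])
    return buffer.getvalue()


sample = [("u1", "shoe-2", 1657180740, "click", ["shoe-1", "shoe-2", "shoe-3"])]
print(impressions_csv(sample))
```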
Impression feedback Example
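    import boto3
    from datetime import datetime

    # Assumes rec_response is the response of an earlier GetRecommendations call.
    personalize_events = boto3.client('personalize-events')

    personalize_events.put_events(
        trackingId='event tracking id',
        userId='user id',
        sessionId='1',
        eventList=[{
            'sentAt': datetime.now().timestamp(),
            'eventType': 'click',
            'itemId': rec_response['itemList'][0]['itemId'],
            # Implicit impressions: reference the recommendation that was shown.
            'recommendationId': rec_response['recommendationId'],
            # Explicit impressions: list the items that were actually visible.
            'impression': [item['itemId'] for item in rec_response['itemList']],
        }]
    )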
Interaction metadata schema example
{
  "type": "record",
  "name": "Interactions",
  "namespace": "com.amazonaws.personalize.schema",
  "fields": [
    { "name": "USER_ID", "type": "string" },
    { "name": "ITEM_ID", "type": "string" },
    { "name": "EVENT_TYPE", "type": "string" },
    { "name": "EVENT_VALUE", "type": [ "float", "null" ] },
    { "name": "LOCATION", "type": "string", "categorical": true },
    { "name": "DEVICE", "type": [ "string", "null" ], "categorical": true },
    { "name": "TIMESTAMP", "type": "long" },
    { "name": "IMPRESSION", "type": "string" }
  ],
  "version": "1.0"
}
Evaluation of a solution with metrics
You can evaluate the performance of your solution version through offline and online metrics. Online metrics are the empirical results you observe in your users' interactions with real-time recommendations. For example, you might record your users' click-through rate as they browse your catalog. You are responsible for generating and recording any online metrics.
Offline metrics are the metrics Amazon Personalize generates when you train a solution version. You can use offline metrics to evaluate the performance of the model before you create a campaign and provide recommendations. Offline metrics allow you to view the effects of modifying a solution's hyperparameters or compare results from models trained with the same data. For the rest of this section, the term metrics refers to offline metrics.
To get performance metrics, Amazon Personalize splits the input interactions data into a training set and a testing set. The split depends on the type of recipe you choose:
- For USER_SEGMENTATION recipes, the training set consists of 80% of each user's interactions data and the testing set consists of 20% of each user's interactions data.
- For all other recipe types, the training set consists of 90% of your users and their interaction data. The testing set consists of the remaining 10% of users and their interaction data.
Amazon Personalize then creates the solution version using the training set. After training completes, Amazon Personalize gives the new solution version the oldest 90% of each user’s data from the testing set as input. Amazon Personalize then calculates metrics by comparing the recommendations the solution version generates to the actual interactions in the newest 10% of each user’s data from the testing set.
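The split for non-USER_SEGMENTATION recipes can be illustrated with the sketch below. It is only an approximation of the description above: the function name, the random user selection, and the rounding rule are assumptions, and the service's real sampling procedure is not documented in this post.

```python
import random


def split_for_evaluation(interactions, test_user_fraction=0.1,
                         holdout_fraction=0.1, seed=0):
    """Split (user_id, item_id, timestamp) tuples for offline evaluation.

    Returns (train, test_input, test_holdout), each mapping user -> item list.
    About 90% of users go to train. For each test user, the oldest 90% of
    their items become the model input and the newest 10% are held out.
    """
    by_user = {}
    for user_id, item_id, timestamp in interactions:
        by_user.setdefault(user_id, []).append((timestamp, item_id))

    users = sorted(by_user)
    random.Random(seed).shuffle(users)
    n_test = max(1, round(len(users) * test_user_fraction))
    test_users = set(users[:n_test])

    train, test_input, test_holdout = {}, {}, {}
    for user_id, events in by_user.items():
        events.sort()  # oldest interaction first
        items = [item_id for _timestamp, item_id in events]
        if user_id not in test_users:
            train[user_id] = items
            continue
        n_holdout = max(1, round(len(items) * holdout_fraction))
        test_input[user_id] = items[:-n_holdout]
        test_holdout[user_id] = items[-n_holdout:]
    return train, test_input, test_holdout


data = [(f"u{u}", f"i{t}", t) for u in range(10) for t in range(10)]
train, test_input, test_holdout = split_for_evaluation(data)
print(len(train), len(test_input), test_holdout)
```

Metrics are then computed by comparing the recommendations generated from test_input against the items in test_holdout.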
Retrieving Metrics
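Offline metrics for a trained solution version can be read with the GetSolutionMetrics operation. The sketch below wraps that call in a small helper; the helper name fetch_offline_metrics is hypothetical, and the client is passed in so the function can be tried without AWS access.

```python
def fetch_offline_metrics(personalize_client, solution_version_arn):
    """Return the metrics dict (metric name -> value) of a solution version."""
    response = personalize_client.get_solution_metrics(
        solutionVersionArn=solution_version_arn
    )
    return response["metrics"]


# Usage sketch (needs boto3, AWS credentials, and a trained solution version;
# solution_version_arn is assumed to be defined by you):
#
#   import boto3
#   metrics = fetch_offline_metrics(boto3.client("personalize"),
#                                   solution_version_arn)
#   for name, value in sorted(metrics.items()):
#       print(name, value)
```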
Coverage
An evaluation metric that tells you the proportion of unique items that Amazon Personalize might recommend using your model out of the total number of unique items in Interactions and Items datasets. To make sure Amazon Personalize recommends more of your items, use a model with a higher coverage score. Recipes that feature item exploration, such as User-Personalization, have higher coverage than those that don’t, such as popularity count.
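As a small illustration of the definition above, coverage can be computed as the share of catalog items that appear in at least one recommendation list. The function name and the toy data are assumptions.

```python
def coverage(recommendation_lists, catalog_item_ids):
    """Fraction of unique catalog items that show up in any recommendation."""
    catalog = set(catalog_item_ids)
    recommended = set()
    for recommendations in recommendation_lists:
        recommended.update(recommendations)
    return len(recommended & catalog) / len(catalog)


# Three of the four catalog items are ever recommended.
print(coverage([["a", "b"], ["b", "c"]], ["a", "b", "c", "d"]))  # 0.75
```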
Mean reciprocal rank at 25
An evaluation metric that assesses the relevance of a model’s highest-ranked recommendation. Amazon Personalize calculates this metric using the average accuracy of the model when ranking the most relevant recommendation out of the top 25 recommendations over all requests for recommendations.
This metric is useful if you're interested in the single highest-ranked recommendation.
For example, suppose a search for "Dog" has the relevant answers {dog, dogs, puppy} and returns the following results:

| rank | item |
|---|---|
| 1 | hotdog |
| 2 | dogs |
| 3 | puppy |
| 4 | cat |

The reciprocal rank (RR) is 1/2, because the highest-ranked relevant answer appears at position 2.
$$RR=\frac{1}{Rank}$$
Similarly, suppose a search for "Cat" has the relevant answers {cat, cats, kitty} and returns the following results:

| rank | item |
|---|---|
| 1 | car |
| 2 | mouse |
| 3 | cats |
| 4 | kitty |

RR is 1/3, because the first relevant answer (cats) appears at position 3.
MRR is the mean of RR over the set of queries $\Omega$:
$$MRR=\frac{1}{\left| \Omega \right|}\sum_{i=1}^{\left| \Omega \right|}\frac{1}{Rank_{i}}$$
For the two examples above, $MRR=\frac{1}{2}\times(\frac{1}{2}+\frac{1}{3})\approx 0.417$.
MRR is simple to compute, but it only considers the highest-ranked relevant item.
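The two search examples can be checked with a short sketch. The function names are illustrative, and k=25 mirrors the "at 25" cut-off of the metric.

```python
def reciprocal_rank(ranked_items, relevant_items, k=25):
    """Return 1/rank of the first relevant item in the top k, or 0.0."""
    for rank, item in enumerate(ranked_items[:k], start=1):
        if item in relevant_items:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(queries, k=25):
    """Average the reciprocal rank over (ranked_items, relevant_items) pairs."""
    return sum(reciprocal_rank(r, rel, k) for r, rel in queries) / len(queries)


queries = [
    (["hotdog", "dogs", "puppy", "cat"], {"dog", "dogs", "puppy"}),  # RR = 1/2
    (["car", "mouse", "cats", "kitty"], {"cat", "cats", "kitty"}),   # RR = 1/3
]
print(round(mean_reciprocal_rank(queries), 3))  # 0.417
```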
Normalized discounted cumulative gain (NDCG) at K (5/10/25)
Unlike MRR, discounted cumulative gain (DCG) takes every ranked item into account.
DCG assumptions
- Highly relevant documents are more useful when they appear earlier in a search engine result list (that is, have higher ranks).
- Highly relevant documents are more useful than marginally relevant documents, which are in turn more useful than non-relevant documents.
$$DCG_{p}=\sum_{i=1}^{p}\frac{rel_{i}}{\log_{2}(i+1)}$$
Here $rel_{i}$ is the relevance of the item at rank $i$, and $p$ is the number of ranked positions considered.
Example
A search for "Dog" returns the following results:

| rank | item | relevance | $\log_{2}(i+1)$ | $\frac{rel_{i}}{\log_{2}(i+1)}$ |
|---|---|---|---|---|
| 1 | hotdog | 0 | 1 | 0 |
| 2 | dogs | 2 | 1.585 | 1.26 |
| 3 | puppy | 1 | 2 | 0.5 |
| 4 | cat | 0 | 2.322 | 0 |

$DCG_{4}=0+1.26+0.5+0=1.76$
An evaluation metric that tells you about the relevance of your model’s highly ranked recommendations, where K is a sample size of 5, 10, or 25 recommendations. Amazon Personalize calculates this by assigning weight to recommendations based on their position in a ranked list, where each recommendation is discounted (given a lower weight) by a factor dependent on its position. The normalized discounted cumulative gain at K assumes that recommendations that are lower on a list are less relevant than recommendations higher on the list.
$\text{weighting factor} = \frac{1}{\log(1+\text{position})}$
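The DCG example for the "Dog" search can be reproduced in a few lines. The normalization step is the standard one (divide by the DCG of the ideal ordering); here the ideal ordering is assumed to be the same relevance scores sorted in descending order, which is a simplification.

```python
import math


def dcg(relevances, k=None):
    """Discounted cumulative gain with a log2(rank + 1) discount."""
    scores = relevances if k is None else relevances[:k]
    return sum(rel / math.log2(rank + 1)
               for rank, rel in enumerate(scores, start=1))


def ndcg(relevances, k=None):
    """DCG divided by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(relevances, reverse=True), k)
    return dcg(relevances, k) / ideal if ideal > 0 else 0.0


relevances = [0, 2, 1, 0]  # hotdog, dogs, puppy, cat
print(round(dcg(relevances), 2))   # 1.76
print(round(ndcg(relevances), 2))  # 0.67
```

A value of 1.0 would mean the results are already in the ideal order.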
Precision at K
An evaluation metric that tells you how relevant your model's recommendations are based on a sample size of K (5, 10, or 25) recommendations. Amazon Personalize calculates this metric based on the number of relevant recommendations out of the top K recommendations, divided by K, where K is 5, 10, or 25.
This metric rewards precise recommendations of the relevant items.
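A minimal sketch of the calculation, with hypothetical item IDs, looks like this:

```python
def precision_at_k(recommended, relevant, k):
    """Number of relevant items among the top k recommendations, divided by k."""
    hits = sum(1 for item in recommended[:k] if item in relevant)
    return hits / k


recommended = ["a", "b", "c", "d", "e"]
relevant = {"b", "d", "z"}
print(precision_at_k(recommended, relevant, 5))  # 0.4
```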
average_rewards_at_k
When you create a solution version (train a model) for a solution with an optimization objective, Amazon Personalize generates an average_rewards_at_k metric. The score for average_rewards_at_k tells you how well the solution version performs in achieving your objective. To calculate this metric, Amazon Personalize calculates the rewards for each user as follows:
rewards_per_user = total rewards from the user's interactions with their top 25 reward-generating recommendations / total rewards from the user's interactions with recommendations
The final average_rewards_at_k is the average of all rewards_per_user normalized to be a decimal value less than or equal to 1 and greater than 0. The closer the value is to 1, the more gains on average per user you can expect from recommendations.
For example, if your objective is to maximize revenue from clicks, Amazon Personalize calculates each user score by dividing total revenue generated by the items the user clicked from their top 25 most expensive recommendations by the revenue from all of the recommended items the user clicked. Amazon Personalize then returns a normalized average of all user scores. The closer the average_rewards_at_k is to 1, the more revenue on average you can expect to gain per user from recommendations.
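The rewards_per_user ratio can be sketched as follows. This is only one reading of the formula above: each user is represented by the rewards of the recommended items they interacted with, the function names are hypothetical, and the plain mean stands in for the normalization step, whose exact form is not specified here.

```python
def rewards_per_user(rewards, k=25):
    """Share of a user's total reward that comes from their top k rewards."""
    total = sum(rewards)
    if total <= 0:
        return 0.0
    top_k = sorted(rewards, reverse=True)[:k]
    return sum(top_k) / total


def average_rewards_at_k(rewards_by_user, k=25):
    """Plain mean of rewards_per_user (assumed stand-in for the normalization)."""
    scores = [rewards_per_user(rewards, k) for rewards in rewards_by_user.values()]
    return sum(scores) / len(scores)


# Toy revenue per clicked recommendation, with k=2 to keep the example small.
clicks = {"u1": [10.0, 5.0, 5.0], "u2": [4.0]}
print(average_rewards_at_k(clicks, k=2))  # 0.875
```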
hit (hit at K)
If you trained the solution version with a USER_SEGMENTATION recipe, the average number of users in the predicted top relevant K results that match the actual users. Actual users are the users who actually interacted with the items in the test set. K is the top 1% of the most relevant users. The higher the value, the more accurate the predictions.
recall (recall at K)
If you trained the solution version with a USER_SEGMENTATION recipe, the average percentage of predicted users in the predicted top relevant K results that match the actual users. Actual users are the users who actually interacted with the items in the test set. K is the top 1% of the most relevant users. The higher the value, the more accurate the predictions.
Example
precision_at_5: if 2 of the top 5 recommendations are relevant, the calculation is 2/5 and the result is 0.4.
Reference
https://docs.aws.amazon.com/personalize/latest/dg/native-recipe-sims.html
https://docs.aws.amazon.com/personalize/latest/dg/native-recipe-new-item-USER_PERSONALIZATION.html
https://ieeexplore.ieee.org/document/1167344
https://arxiv.org/pdf/2007.11808.pdf
https://docs.aws.amazon.com/personalize/latest/dg/how-it-works-dataset-schema.html
https://docs.aws.amazon.com/personalize/latest/dg/working-with-training-metrics.html
https://docs.aws.amazon.com/personalize/latest/dg/optimizing-solution-for-objective.html
https://www.youtube.com/watch?v=givCgC_RIFs