Remove UNIMPORTANT Stop-words
Please click on this link to commence the process https://pla.delaconcorp.com/siteui/samconfig/stopwords_config and log in to your Delacon Portal.
Words such as "am", "is", "are" etc. occur very frequently in almost every conversation and carry very little significance for understanding its context. These words are not important and are, by convention, called stop-words.
Besides these common stop-words, we also identify the top 100 words occurring frequently in your business conversations. These words are the potential stop-words in your business domain.
If any of these words carries significant information for your insight, please remove that word from the auto-generated list of domain-specific stop-words.
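As a rough illustration of how a list of frequent words could be derived, the sketch below counts words across a few transcripts, skips the common stop-words, and keeps the most frequent of the rest. The stop-word set, the function name `candidate_domain_stop_words`, and the sample transcripts are assumptions made for this example; this is not the method Delacon actually uses.

```python
import re
from collections import Counter

# Assumed, illustrative set of common stop-words (not Delacon's actual list).
COMMON_STOP_WORDS = {"am", "is", "are", "the", "a", "to", "i", "my", "for", "on", "about"}


def candidate_domain_stop_words(transcripts, top_n=100):
    """Return the top_n most frequent words that are not common stop-words."""
    counts = Counter()
    for text in transcripts:
        for word in re.findall(r"[a-z']+", text.lower()):
            if word not in COMMON_STOP_WORDS:
                counts[word] += 1
    return [word for word, _ in counts.most_common(top_n)]


transcripts = [
    "I am calling about my car service",
    "Is the car ready for service",
    "The car is booked for a service on Monday",
]
top_words = candidate_domain_stop_words(transcripts, top_n=2)
print(top_words)
```

In this small sample, "car" and "service" occur in every transcript, so they surface as candidates. Whether they stay in the stop-word list is the business decision described above.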
To remove a word from this list, please follow these instructions:
- First, select which profile you will be using. Profiles refer to which aspect of the speech you are analysing. One profile might be looking at "types of product", while another might be looking at "call type", for example. These profiles will be created by Delacon during your initial set up.
- Hover over the word you want to delete - it will display the delete button (x).
- To delete the word, click on the x button, which will cross out the word. At this stage, the word is not deleted from the database: the cross-out only means that the word is selected for deletion. To update the database, click Save as described in the last step.
Please keep in mind that you remove a word from this list of unimportant stop-words because it is important for your insight.
- If a word appears very frequently in the list of top keywords but does not contribute any meaningful information to your insight, you can add it to this list of unwanted stop-words. To add a stop-word, enter it in the text box and press Add. The word will be added and displayed in the list. However, it is not stored in the database yet. To update the database, click Save as described in the last step.
- Once you have reviewed the whole list and selected the words that you want to remove from the domain-specific stop-words, click on the Save button. This will update the database for domain-specific stop-words. It is recommended to save your modifications at reasonable intervals to avoid being logged out of the system because of inactivity.
Keyword Priority Configuration
Please click on this link to commence the process
https://pla.delaconcorp.com/siteui/samconfig/term_and_weight_config
You must already be logged into your Delacon portal.
Nobody knows your business better than you, and your expert knowledge is a valuable resource in developing insights about the business. This feature of Speech Analytics transfers your expert knowledge into the system. It therefore plays a significant role in identifying important keywords and phrases and in tagging each call to one or more pre-set categories as appropriate.
Please include here any important "Term" you want to be included in the Speech Analysis, including the "Weight" and the call "Category" it belongs to.
The Term can be any word or phrase. Each term here explains/defines a Category. Please note, the same term can define multiple categories. For example, the term “credit card” can be related to both “sales” and “billing” categories.
The Weight can range from -5 to +5. The higher/more positive the term weight is, the higher its importance in that call Category. For example, “credit card” is often more related to “billing” category compared to “sales”. Hence, for “billing”, this term can be assigned a weight of 5, whereas for sales the weight can be 3.
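The relationship between Term, Weight, and Category can be pictured as a small record with a range check. The sketch below is illustrative only: the class name `TermEntry` and its fields are assumptions for this example and do not describe how the Delacon system stores the data.

```python
from dataclasses import dataclass

MIN_WEIGHT, MAX_WEIGHT = -5, 5  # weight range stated in this article


@dataclass(frozen=True)
class TermEntry:
    """Hypothetical record for one row of the Keyword Priority Configuration."""
    term: str
    category: str
    weight: int

    def __post_init__(self):
        # Reject weights outside the documented -5..+5 range.
        if not MIN_WEIGHT <= self.weight <= MAX_WEIGHT:
            raise ValueError(
                f"weight {self.weight} is outside {MIN_WEIGHT}..{MAX_WEIGHT}"
            )


# The same term can define several categories, each with its own weight.
entries = [
    TermEntry("credit card", "billing", 5),
    TermEntry("credit card", "sales", 3),
]
weights = {(e.term, e.category): e.weight for e in entries}
print(weights[("credit card", "billing")])  # 5
```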
To add a term in this list, please follow these instructions:
1. Click Add. This will add a new row at the top of the table.
2. Enter the Term, select Weight and Category for this term. Select the Matching algorithm – Exact for exact matching or Intelligent for context-based matching. Then click on Confirm to add it to the list. Please note, at this point the term is only added to this list - it is not saved in the database.
3. To Edit or Remove the added term, click on the corresponding buttons. As before, the operation is performed locally only; the database isn’t updated yet.
4. Finally, once keyword priority data input is complete, click the Submit button to save all the data in the database.
You can filter the list for specific keyword/phrase, weight, category, or matching algorithm by entering input in the search textbox above the table.
You can click the Submit button at any time to save the data entered so far in the database. It is recommended to save your modifications at reasonable intervals to avoid being logged out of the system because of inactivity.
2.1 Guideline for Term and Weight selection
This section provides a guideline on how to select words/phrases representing a specific category for the Keyword Priority Configuration. It also discusses the strategy for assigning weights to the selected words/phrases. For simplicity, words and phrases are both called terms here.
2.1.1 Mapping of Business Needs with the Categories
The first step in identifying the terms and weights is to select the categories by which each call will be categorised or tagged. The list of categories needs to be selected in alignment with the business needs.
It is advisable to do the mapping of business needs with the categories at the very beginning. The modification of categories and related term-weights would be very challenging at later stages.
2.1.2 Identify Call Category
The recommended next step is to listen to a few call recordings to identify the category of those calls.
Please note that the number of recordings you need to listen to depends greatly on the nature of the business and on the category. However, if no new words/terms can be found for a specific category in 4 or 5 new recorded conversations (of the same category), it is safe to assume that sufficient words/terms have been found for that category.
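The stopping rule above can be written down as a simple check. The sketch below assumes you note the terms you found in each recording of one category, in the order you listened to them. The function name `recordings_needed` and the `patience` parameter are hypothetical names chosen for this example.

```python
def recordings_needed(terms_per_recording, patience=4):
    """Return how many recordings were reviewed when `patience` consecutive
    recordings added no new terms, or None if that point was never reached."""
    seen = set()
    quiet_run = 0  # consecutive recordings that contributed nothing new
    for index, terms in enumerate(terms_per_recording, start=1):
        new_terms = set(terms) - seen
        seen |= new_terms
        quiet_run = 0 if new_terms else quiet_run + 1
        if quiet_run >= patience:
            return index
    return None


# Terms noted while listening to six "finance" calls (illustrative data).
finance_calls = [
    {"comparison rate"},
    {"comparison rate", "credit risk"},
    {"credit risk"},
    {"comparison rate"},
    {"credit risk"},
    {"comparison rate"},
]
print(recordings_needed(finance_calls))  # 6
```

Here the last new term appears in the second call, and the following four calls add nothing, so the review can stop after the sixth call.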
2.1.3 Selecting Words/Terms
1. Now identify terms that represent that specific category from the conversations and transcripts available in the Delacon Portal. For example, “comparison rate” and “credit risk” can relate to the “finance” category for a specific business.
2. It is important to compare the selected word/term both in the audio conversation and transcribed text to identify the usual form of any term that the machine learning based speech-to-text conversion engine provides.
For example,
- “1300” is found to be transcribed as “one 300” by the transcription service used in Delacon system.
- “b-pay” can be transcribed as “b-pay”, “bpay”, or “b pay”.
- Any of “price”, “prices”, or “pricing” can be used in the conversation.
- Consider the singular and plural forms of a word. Both “transaction” and “transactions” should be included with the same weight.
3. Consider all the forms of a term that can be used in the conversation. For the first two cases above, treat each form as a separate term, whereas the Intelligent matching algorithm can identify different variants of the same term.
4. A selected term can represent multiple categories. For such a term, it is preferable to assign different weights for different categories based on its importance in each. For example, “paperwork” (or “paper work”) can have a higher weight for the category “finance” than for “service”.
5. A single word can be very general in meaning and hence might not be treated as a category-representing word. However, combining it with the previous and/or following word(s) (i.e. forming a phrase) might describe the category effectively. For example, in a car dealership business, “drop” does not exhibit any relation to the “service” category, but “drop off” is often used for “dropping the car off for servicing”.
6. If the client sells specific products or services, then add the products that are related to the category. For example, names of car models, including mispronounced names that can be related back to the model the customer actually meant, can direct calls to the “sales” or “service” departments, whereas “mirror cover” and “coolant” are more related to the “parts” or “service” departments.
7. When an automatically identified stop-word is removed, it is stored in the “KEYWORD PRIORITY CONFIGURATION” table with a default weight of “-2” and the category “general”. The weight and category of such words may be modified later to optimise speech analysis performance.
8. Terms that do not represent any category but are important in providing insights, such as campaign-related terms, can be stored under the “general” category with higher weights. Such a configuration ensures that these words appear at the top of the important keyword list. For example, “call back” can be important information about a call.
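When a term has several transcribed forms, each form needs its own row with the same category and weight. The sketch below shows one way to prepare those rows before typing them in. The helper names `hyphen_variants` and `expand_entries`, and the "billing" category and weights used in the usage lines, are assumptions for illustration only.

```python
def hyphen_variants(term):
    """Return the hyphenated, joined, and spaced forms of a hyphenated term."""
    if "-" not in term:
        return [term]
    return [term, term.replace("-", ""), term.replace("-", " ")]


def expand_entries(term, category, weight, extra_forms=()):
    """Build one (term, category, weight) row per known form of the term."""
    forms = hyphen_variants(term) + list(extra_forms)
    return [(form, category, weight) for form in forms]


# "b-pay" may be transcribed as "b-pay", "bpay", or "b pay".
bpay_rows = expand_entries("b-pay", "billing", 4)
# Singular and plural forms get the same weight.
transaction_rows = expand_entries("transaction", "billing", 3, ["transactions"])

for row in bpay_rows + transaction_rows:
    print(row)
```

Forms that cannot be derived mechanically, such as “one 300” for “1300”, can be passed through `extra_forms` once you have seen them in a transcript.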
2.1.4 Assign Weights
The weight ranges from -5 to +5, including 0. A positive weight promotes the position of the term in the final list of top keywords (5 is the maximum) and a negative weight does the opposite. A weight of 0 does not change the candidate term’s position in the list; however, the term can still contribute to the tagging score. For example, if the word “service” occurs a lot in service-related conversations, then assigning a positive weight will move this word to the top of the list. This might be undesirable in some cases, because the word “service” can be used in different contexts, not only in conversations related to the “service” category. In this scenario, the word “service” can be assigned a weight of 0 (zero) for the category “service”. The term’s ranking won’t be modified in this case, but if it is in the top 20% of the ranked keywords, it will be considered when computing the confidence score of the “service” category.
Ranking Score Modification of a Term with Weights:
The score of each word is modified using the weight information provided in the priority configuration. The whole range of scores is divided into five equal sections, and the score of a term is modified linearly in accordance with the input weight of that term.
As shown in this figure, the whole range is divided into five groups, each containing 20% of the words. Based on the weight of a word/term, the ranking of that word shifts upwards or downwards in this list. For example, the score of the word “ombudsman” is very low and hence it is ranked in the lowest 20% of the list (red). However, the weight for “ombudsman” is 4, which means the position of the word will move upwards (positive weight) by 4 groups. Therefore, the final position of this word will be in the top 20% (blue). A weight of 5 would shift its ranking to the top of the list.
The weight ranges from -5 to +5, including 0. A positive weight moves the word upwards, a negative weight lowers the position/ranking of the word, and zero (0) means no change in position.
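The group movement described above can be sketched as a shift across five bands. This is a simplified reading of the description, not the actual scoring formula, which this article does not give. The function name `shifted_group` and the band numbering (0 for the lowest 20%, 4 for the top 20%) are assumptions for this example, and the sketch does not model a term's position inside a band.

```python
GROUPS = 5  # five bands, each holding 20% of the ranked words


def shifted_group(group, weight):
    """Return the band a term lands in after its weight is applied.

    group: 0 is the lowest 20% of the list, 4 is the top 20%.
    weight: integer from -5 to +5; each unit moves the term by one band.
    """
    if not -5 <= weight <= 5:
        raise ValueError("weight must be between -5 and +5")
    # The result is clamped so a term never leaves the list at either end.
    return max(0, min(GROUPS - 1, group + weight))


print(shifted_group(0, 4))   # "ombudsman": lowest band, weight 4 -> 4 (top 20%)
print(shifted_group(4, -2))  # a top-band stop-word with weight -2 -> 2
print(shifted_group(2, 0))   # weight 0 leaves the band unchanged -> 2
```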
- If the word/term represents or describes a specific category, then the corresponding weight should be very high (5 or 4). For example, in a car dealership business, “drive away” or “order number” is highly likely to be used only in “sales” related conversations, so for the “sales” category “drive away” or “order number” should get a weight of 4 or 5. Similarly, “credit risk” is very likely to occur only in “finance” related conversations, so the weight for “credit risk” should be 4 or 5 for the “finance” category.
- If the term is also used in other contexts (categories), then the weight should be low. For example, "cancel" can be used in different contexts, such as cancel the service, cancel the order, noise cancellation etc. But “cancel my booking” generally indicates information related to the category “Booking”. So the weight for “cancel” (weight = 2) should be lower than that of “cancel my booking” (weight = 5).
- Some terms may be used in conversations related to different categories. The weight should be assigned according to the term’s importance in representing/describing each category. For example, the word “cost” can be used in different departments such as “sales”, “spare parts”, and “service” in a car dealership business. However, its significance is greatest in the “sales” category, then in “spare parts”, and finally in “service”. The significance can also be described as how relevant the term is to the category. Therefore, for the word “cost”, in “sales” weight = 4, in “spare parts” weight = 3, and in “service” weight = 2. Another example is the use of the word “rego” in “service” (weight = 3), “spare parts” (weight = 2), and “sales” (weight = 1).
- When an automatically identified stop-word is removed, it is stored in the “KEYWORD PRIORITY CONFIGURATION” table with a default weight of “-2” and the category “general”. The weight and category of such words may be modified later to optimise speech analysis performance.
- Because these words are stop-words, their frequency will be comparatively high. Assigning a weight of 0 to a specific stop-word should be enough to ensure that it appears at the top of the final keyword list. For example, if the stop-word “key” is important for a customer, then change the weight of “key” from -2 to 0.
- If a weight of -2 does not keep the stop-word out of the top list of keywords, then the weight can be reduced to -3 and so on.
- Terms that do not represent any category but are important in providing insights, such as campaign-related terms, can be stored under the “general” category with higher weights. Such a configuration ensures that these words appear at the top of the important keyword list. For example, “call back” can be important information about a call, so it can be given category = general and weight = 5.
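The per-category weights from the examples above can be laid out as a small lookup table, which makes it easy to check that each term is weighted highest where it is most relevant. The table below copies the weights given in this section. The name `categories_by_relevance` is a hypothetical helper for this sketch, not part of the Delacon portal.

```python
# (term, category) -> weight, taken from the examples in this section.
WEIGHTS = {
    ("cost", "sales"): 4,
    ("cost", "spare parts"): 3,
    ("cost", "service"): 2,
    ("rego", "service"): 3,
    ("rego", "spare parts"): 2,
    ("rego", "sales"): 1,
    ("call back", "general"): 5,
}


def categories_by_relevance(term):
    """List (category, weight) pairs for a term, most relevant category first."""
    pairs = [(cat, w) for (t, cat), w in WEIGHTS.items() if t == term]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


print(categories_by_relevance("cost"))
# [('sales', 4), ('spare parts', 3), ('service', 2)]
print(categories_by_relevance("rego"))
# [('service', 3), ('spare parts', 2), ('sales', 1)]
```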
2.1.5 Select Matching Algorithm
Select the matching algorithm for each term. The options are: Exact and Intelligent. As the name suggests, to identify a term in exact form in the transcript, select Exact as matching algorithm.
The Intelligent matching algorithm can be used to identify a topic in the transcript. For example, for the input topic “refund of course fee”, the speech analytics can identify:
- I want a refund of my course fee
- I want a refund of my courses fee
- I want a refund of my course fees
- I want my course fee refunded
- Refund my course fee(s) immediately
- Do you want a refund of your course fee(s)?
- We don’t refund the course fee(s)
- Is the course fee(s) refundable?
- Is it refundable, the course fee(s)?
- The refundability of course fee(s) is not certain.
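To give a feel for why the sentences above all count as the same topic, the sketch below uses a deliberately crude stand-in: it ignores a few filler words and checks that every remaining topic word is the beginning of some word in the sentence. This is not the Intelligent matching algorithm used by Delacon, whose details are not described here. The names `IGNORED`, `tokens`, and `loosely_matches` are assumptions for this example.

```python
import re

# Assumed filler words to skip when reading the topic.
IGNORED = {"of", "the", "a", "my", "your"}


def tokens(text):
    """Lower-case the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def loosely_matches(topic, sentence):
    """True if every non-filler topic word begins some word of the sentence."""
    sentence_tokens = tokens(sentence)
    return all(
        any(tok.startswith(word) for tok in sentence_tokens)
        for word in tokens(topic)
        if word not in IGNORED
    )


topic = "refund of course fee"
print(loosely_matches(topic, "I want my course fee refunded"))                        # True
print(loosely_matches(topic, "The refundability of course fee(s) is not certain."))  # True
print(loosely_matches(topic, "I want to pay my course fee"))                          # False
```

A prefix check like this also produces false matches (for example, "feed" would match "fee"), which is one reason a real context-based matcher needs more than this.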
Additional Important Notes:
- It is to be noted that only the future call transcription analysis will be affected once Keyword Priority Configuration is updated. To re-analyse the previous calls, please request Delacon support for re-analysis.
- Each combination of word/term and category entered in “2 KEYWORD PRIORITY CONFIGURATION” must be unique.
- Repetition of the same word is very common in conversation but rare in important terms or phrases representing a category. Therefore, a phrase that repeats a word, e.g. ‘campaign campaign’, will not be treated as a term, and one of the repeated words will be removed. As a result, ‘campaign campaign price’ will be treated as ‘campaign price’, with the first ‘campaign’ removed from the analysis.
- A word/term belonging to multiple categories will contribute to the tagging score of each category according to its weight for that category. If the call is related to “sales”, then the contributions of this word and the other words will generate the highest tagging score for “sales”.
- It is advisable to get assistance from the call centre agents, who actually handle the calls, in identifying category related terms and their weights (i.e. importance).
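Two of the notes above, the removal of repeated words and the uniqueness of each term and category combination, can be checked on a draft list before it is entered. The sketch below is illustrative: the function names are hypothetical, and comparing terms case-insensitively is an assumption that this article does not confirm.

```python
def collapse_repeats(term):
    """Drop a word when it immediately repeats the previous word."""
    words = term.split()
    kept = [
        word
        for i, word in enumerate(words)
        if i == 0 or word.lower() != words[i - 1].lower()
    ]
    return " ".join(kept)


def duplicate_combinations(rows):
    """Return the (term, category) keys that appear more than once.

    rows: iterable of (term, category, weight) tuples.
    Assumption: terms and categories are compared case-insensitively.
    """
    seen, duplicates = set(), []
    for term, category, _weight in rows:
        key = (collapse_repeats(term).lower(), category.lower())
        if key in seen:
            duplicates.append(key)
        seen.add(key)
    return duplicates


print(collapse_repeats("campaign campaign price"))  # campaign price

draft = [
    ("campaign price", "general", 5),
    ("campaign campaign price", "general", 4),  # same term once collapsed
    ("campaign price", "sales", 3),             # different category, allowed
]
print(duplicate_combinations(draft))  # [('campaign price', 'general')]
```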
To learn more about the Call Categoriser, go here.
To see how Speech Analytics data can be analysed, please read our article on Call Visualisation and Transcription.
To learn more about Keyword Spotting, go here.