Alien Dictionary
I share my learnings here. Thanks for reading.
Problem
A new alien language uses the English alphabet, but the order of letters is unknown. You are given a list of words[] from the alien language’s dictionary, where the words are claimed to be sorted lexicographically according to the language’s rules. (Geeks)
Your task is to determine the correct order of letters in this alien language based on the given words. If the order is valid, return a string containing the unique letters in lexicographically increasing order as per the new language's rules. If there are multiple valid orders, return any one of them.
However, if the given arrangement of words is inconsistent with any possible letter ordering, return an empty string ("").
A string a is lexicographically smaller than a string b if, at the first position where they differ, the character in a appears earlier in the alien language than the corresponding character in b. If all characters in the shorter word match the beginning of the longer word, the shorter word is considered smaller.
Note: Your implementation will be tested using a driver code. It will print true if your returned order correctly follows the alien language’s lexicographic rules; otherwise, it will print false.
Examples:
Input: words[] = ["baa", "abcd", "abca", "cab", "cad"]
Output: true
Explanation: A possible corrct order of letters in the alien dictionary is "bdac".
The pair "baa" and "abcd" suggests 'b' appears before 'a' in the alien dictionary.
The pair "abcd" and "abca" suggests 'd' appears before 'a' in the alien dictionary.
The pair "abca" and "cab" suggests 'a' appears before 'c' in the alien dictionary.
The pair "cab" and "cad" suggests 'b' appears before 'd' in the alien dictionary.
So, 'b' → 'd' → 'a' → 'c' is a valid ordering.
Input: words[] = ["caa", "aaa", "aab"]
Output: true
Explanation: A possible corrct order of letters in the alien dictionary is "cab".
The pair "caa" and "aaa" suggests 'c' appears before 'a'.
The pair "aaa" and "aab" suggests 'a' appear before 'b' in the alien dictionary.
So, 'c' → 'a' → 'b' is a valid ordering.
Input: words[] = ["ab", "cd", "ef", "ad"]
Output: ""
Explanation: No valid ordering of letters is possible.
The pair "ab" and "ef" suggests "a" appears before "e".
The pair "ef" and "ad" suggests "e" appears before "a", which contradicts the ordering rules.
Constraints:
1 ≤ words.length ≤ 500
1 ≤ words[i].length ≤ 100
words[i] consists only of lowercase English letters.
Solution
Idea: The main idea is that for each pair in the list, we can determine the order of two letters. For example, in "zb" and "ax," the sequence of letters will be [z, a], meaning z comes before a. We don't need to compare each word with all the others; we can simply compare it with the next word. This gives us one order, and from there, we can compare adjacent words to find the sequence. It won't give all possible orders, but it provides enough information to predict the full sequence.
This is an application of the Topological Sort algorithm because each pair can be represented as a directed edge and each character as a vertex.
Time Complexity: O(n) + O(26)
Space Complexity: O(26)
Overall
Time Complexity: O(n)
Space Complexity: O(1)
class Solution {
public String findOrder(String[] words) {
Set<List<Integer>> edgeSet = new HashSet<>();
Map<Character, Integer> charMap = new HashMap<>();
Map<Integer, Character> numMap = new HashMap<>();
createMaps(words,charMap, numMap);
boolean isCreated = createEdgeList(words, edgeSet, charMap);
if(!isCreated) return "";
int[][] edges = getEdges(edgeSet);
int V = charMap.size();
List<Integer> topoSortOrder = topoSort(V, edges);
return topoSortOrder.size()==V ? getOrder(topoSortOrder, numMap) : "";
}
private String getOrder(List<Integer> topoSortOrder, Map<Integer, Character> numMap){
String order = "";
for(int vertex : topoSortOrder){
order+=numMap.get(vertex);
}
return order;
}
private int[][] getEdges(Set<List<Integer>> edgeSet){
int[][] edges = new int[edgeSet.size()][2];
int i = 0;
for(List<Integer> edge : edgeSet){
edges[i][0] = edge.get(0);
edges[i][1] = edge.get(1);
i+=1;
}
return edges;
}
private boolean createEdgeList(String[] words, Set<List<Integer>> edgeSet, Map<Character, Integer> charMap){
int counter = 0;
for(int i=0; i<words.length-1; i++){
String word1 = words[i];
String word2 = words[i+1];
boolean isBroken = false;
for(int j=0; j<word1.length() && j<word2.length(); j++){
char charX = word1.charAt(j);
char charY = word2.charAt(j);
if(charX != charY){
List<Integer> edge = Arrays.asList(charMap.get(charX), charMap.get(charY));
edgeSet.add(edge);
isBroken = true;
break;
}
}
if(!isBroken) {
if(word1.length()>word2.length())
return false; // invalid dictionary
}
}
return true;
}
private void createMaps(String[] words, Map<Character, Integer> charMap, Map<Integer, Character> numMap){
int counter = 0;
for(String word : words){
for(int i=0; i<word.length(); i++){
char c = word.charAt(i);
if(!charMap.containsKey(c)){
charMap.put(c, counter);
numMap.put(counter, c);
counter+=1;
}
}
}
}
// TOPOLOGICAL SORTING
public static ArrayList<Integer> topoSort(int V, int[][] edges) {
List<List<Integer>> adjList = createAdjList(V, edges);
int[] visited = new int[V];
int[] indegrees = createIndegrees(V, edges);
Queue<Integer> queue = initializeQueueWith0IndegreeVertices(indegrees);
ArrayList<Integer> order = new ArrayList<>();
while(!queue.isEmpty()){
int vertex = queue.poll();
order.add(vertex);
for(int node : adjList.get(vertex)){
indegrees[node]-=1;
if(indegrees[node]==0) queue.add(node);
}
}
return order;
}
private static Queue<Integer> initializeQueueWith0IndegreeVertices(int[] indegrees) {
Queue<Integer> queue = new ArrayDeque<>();
for(int i=0; i<indegrees.length; i++){
if(indegrees[i]==0){
queue.add(i);
}
}
return queue;
}
private static int[] createIndegrees(int V, int[][] edges) {
int[] indegrees = new int[V];
for(int[] edge : edges){
indegrees[edge[1]]+=1;
}
return indegrees;
}
private static List<List<Integer>> createAdjList(int V, int[][] edges) {
List<List<Integer>> adjList = new ArrayList<>();
for(int i=0; i<V; i++){
adjList.add(new ArrayList<>());
}
for(int[] edge : edges){
adjList.get(edge[0]).add(edge[1]);
}
return adjList;
}
}