# EPI-Sorting [TOC]

## Sorting

### 库函数

``````//sort() 数组排序
//list::sort() 链表排序
``````
• `sort()``list::sort()`都需要实现运算符重载`operator<()`或指定比较函数

• 使用排序使后续的算法更简单
• 使用库函数即可
• 实现自定义的排序要求
• 使用BST, heap,索引数组等.

### Compute the intersection of two sorted arrays

A natural implementation for a search engine is to retrieve documents that match the set of words in a query by maintaining an inverted index. Each page is assigned an integer identifier , its document - ID . An inverted index is a mapping that takes a word w and returns a sorted array of page - ids which contain zv the sort order could be , for example , the page rank in descending order. When a query contains multiple words , the search engine finds the sorted array for each word and then computes the intersection of these arrays these are the pages containing all the words in the query . The most computationally intensive step of doing this is finding the intersection of the sorted arrays .

Write a program which takes as input two sorted arrays , and returns a new array containing elements that are present in both of the input arrays . The input arrays may have duplicate entries , but the returned array should be free of duplicates . For example , the input is ( 2 , 3 , 3 , 5 , 5 , 6 , 7 , 7 , 8 , 12 ) and ( 5 , 5 , 6 , 8, 8, 9, 10 , 10 ) , your output should be ( 5 , 6, 8 ) .

``````vector<int> IntersectTwoSortedArrays(const vector<int>& A,
const vector<int>& B) {
// TODO - you fill in here.
vector<int> ret;
int a_st = 0, b_st = 0;
while (a_st < A.size() && b_st < B.size()) {
if (A[a_st] == B[b_st] && (ret.empty() || ret.back() != A[a_st])) {
ret.emplace_back(A[a_st]);
++a_st, ++b_st;
} else if (A[a_st] < B[b_st]) {
++a_st;
} else {
++b_st;
}
}
return ret;
}
``````

### Merge two sorted arrays

``````void MergeTwoSortedArrays(vector<int>& A, int m, const vector<int>& B, int n) {
// TODO - you fill in here.
int st = m - 1, ed = A.size() - 1;
while (st >= 0) {
A[ed--] = A[st--];
}
//ed 指向空位置
int A_st = ed + 1, B_st = 0, A_curr = 0;
while (A_st < A.size() && B_st < B.size()) {
if (A[A_st] <= B[B_st]) {
A[A_curr++] = A[A_st++];
} else {
A[A_curr++] = B[B_st++];
}
}
while (A_st < A.size()) {
A[A_curr++] = A[A_st++];
}
while (B_st < B.size()) {
A[A_curr++] = B[B_st++];
}
return;
}
``````

``````void MergeTwoSortedArrays(vector<int>& A, int m, const vector<int>& B, int n) {
// TODO - you fill in here.
int A_curr = m + n - 1, A_st = m - 1, B_st = n - 1;
while (A_st >= 0 && B_st >= 0) {
if (A[A_st] >= B[B_st]) {
A[A_curr--] = A[A_st--];
} else {
A[A_curr--] = B[B_st--];
}
}
while (B_st >= 0) {
A[A_curr--] = B[B_st--];
}
return;
}
``````

### Remove first-name duplicates

Design an efficient algorithm for removing all first - name duplicates from an array . For example , if the input is ( ( Ian , Botham ) , ( David , Gower ) , ( Ian , Bell ) , ( Ian , Chappell ) ) , one result could be (( Ian , Bell ), ( David , Gower ) ) ; ( ( David , Gower ), ( Ian , Botham ) ) would also be acceptable .

Hint: Bring equal items close together .

``````void EliminateDuplicate(vector<Name>* names) {
// TODO - you fill in here.
vector<Name>& name_ref = *names;
std::sort(name_ref.begin(), name_ref.end());
int l_idx = 0, r_idx = 1;
while (r_idx < name_ref.size()) {
if (name_ref[r_idx].first_name == name_ref[l_idx].first_name) {
++r_idx;
} else {
name_ref[++l_idx] = name_ref[r_idx++];
}
}
if (l_idx == name_ref.size() - 1) {
return;
}
int residual = name_ref.size() - l_idx - 1;
while (residual-- > 0) {
name_ref.pop_back();
}
return;
}
``````

EPI给了一个用`std::unique`去重的方法. 根据文档的说明, 这是C++17及C++20才支持的功能. 我用的Clion目前只配置了C++11, 就不折腾了.

Eliminates all but the first element from every consecutive group of equivalent elements from the range `[first, last)` and returns a past-the-end iterator for the new logicalend of the range.

Removing is done by shifting the elements in the range in such a way that elements to be erased are overwritten. Relative order of the elements that remain is preserved and the physical size of the container is unchanged. Iterators pointing to an element between the new logical end and the physical end of the range are still dereferenceable, but the elements themselves have unspecified values. A call to `unique` is typically followed by a call to a container's `erase` method, which erases the unspecified values and reduces the physical size of the container to match its new logical size.

### Render a calendar

Consider the problem of designing an online calendaring application . One compo nent of the design is to render the calendar , i .e. , display it visually .

Suppose each day consists of a number of events , where an event is specified as a start time and a finish time. Individual events for a day are to be rendered as nonoverlapping rectangular regions whose sides are parallel to the X - and Y - axes. Let the X - axis correspond to time. If an event starts at time b and ends at time e , the upper and lower sides of its corresponding rectangle must be at b and e , respectively . Figure 14.1 represents a set of events.

Write a program that takes a set of events , and determines the maximum number of events that take place concurrently .

~~我不太知道怎么处理这个问题。~~

``````struct Event {
int start, finish;
};

struct EndPoint {
bool operator<(const EndPoint& e) const{
// 将时间靠前的拍在前面 如果时间相同 将start 排在 end 前面
return this->time != e.time ? this->time < e.time : (this->isStart && !e.isStart);
}
int time;
bool isStart;
};
int FindMaxSimultaneousEvents(const vector<Event>& A) {
// TODO - you fill in here.
vector<EndPoint> E;
for (auto &each : A) {
E.push_back({each.start, true});
E.push_back({each.finish, false});
}
//按照时间排序
std::sort(E.begin(), E.end());

int max_event_cnt = 0, event_cnt = 0;
for (auto &each : E) {
if (each.isStart) {
++event_cnt;
max_event_cnt = std::max(event_cnt, max_event_cnt);
} else {
--event_cnt;
}
}
return max_event_cnt;
}
``````

### Merging intervals

Suppose the time during the day that a person is busy is stored as a set of disjoint time intervals. If an event is added to the person ' s calendar , the set of busy times may need to be updated .

In the abstract , we want a way to add an interval to a set of disjoint intervals and represent the new set as a set of disjoint intervals. For example , if the initial set of intervals is [ - 4 , - 1 ] , [ 0 , 2 ] , [ 3 , 6 ] , [ 7 , 9 ] , [ 11 , 12 ] , [ 14 , 17 ] , and the added interval is [ 1 , 8 ] , the result is [ - 4 , - 1 ] , [ 0 , 9 ] , [ 11 , 12 ] , [ 14 , 17 ] .

Write a program which takes as input an array of disjoint closed intervals with integer endpoints , sorted by increasing order of left endpoint , and an interval to be added , and returns the union of the intervals in the array and the added interval. Your result should be expressed as a union of disjoint intervals sorted by left endpoint .

Hint What is the union of two closed intervals?

• 如果是单个区间，那么遇到一个left后就会遇到right，提取出一个区间
• 如果是多个区间有交叉，对left的个数进行计数，只有left的个数等于right的个数，才算是一个完整的区间。
``````struct Interval {
int left, right;
};
struct EndPoint {
bool operator<(const EndPoint& e) const {
return point != e.point ? point < e.point : (isLeft && !e.isLeft);
}
int point;
bool isLeft;
};
Interval new_interval) {
// TODO - you fill in here.
vector<EndPoint> E;
for (auto &each: disjoint_intervals) {
E.push_back({each.left, true});
E.push_back({each.right, false});
}
E.push_back({new_interval.left, true});
E.push_back({new_interval.right, false});

std::sort(E.begin(), E.end());

vector<Interval> ret;
bool is_new_interval = true;
int st = 0, left_cnt = 0;
for (auto &each : E) {
if (each.isLeft) {
++left_cnt;
if (is_new_interval) {
st = each.point;
is_new_interval = false;
}
} else {
--left_cnt;
}
if (left_cnt == 0) {
ret.push_back({st, each.point});
is_new_interval = true;
}
}
return ret;
}
``````

.... 我这个时间复杂度是\$O(n\log n)\$, 如果输入的区间已经是有序排列, 可以将时间复杂度降低到\$O(n)\$.

### Compute the union of intervals

In this problem we consider sets of intervals with integer endpoints ; the intervals may be open or closed at either end. We want to compute the union of the intervals in such sets.

``````struct Interval {
struct Endpoint {
bool is_closed;
int val;
};

Endpoint left, right;
};

struct Point {
bool operator<(const Point& p) const {
//        return val != p.val ? val < p.val : (is_left && !p.is_left);
if (val != p.val) {
return val < p.val;
}
// 元素相等 两个都是开区间
if (!is_closed && !p.is_closed) {
// 一前一后 将right 放在left前
return !is_left && p.is_left;
}

// 元素相等 一开一闭 都是right 开区间放前
if (!is_left && !p.is_left) {
return !is_closed && p.is_closed;
}
// 元素相等 一开一闭 都是left 闭区间放前
if (is_left && p.is_left) {
return is_closed && !p.is_closed;
}
// 元素相等 一开一闭 left放在right前
return is_left && !p.is_left;
}
int val;
bool is_left;
bool is_closed;
};
vector<Interval> UnionOfIntervals(vector<Interval> intervals) {
// TODO - you fill in here.
vector<Point> P;
for (auto &each : intervals) {
P.push_back({each.left.val, true, each.left.is_closed});
P.push_back({each.right.val, false, each.right.is_closed});
}
std::sort(P.begin(), P.end());
vector<Interval> ret;

bool is_new_intervals = true, left_closed = false;
int left_idx = 0, left_cnt = 0;
for (auto &each : P) {
if (each.is_left) {
++left_cnt;
if (is_new_intervals) {
left_idx = each.val;
left_closed = each.is_closed;
is_new_intervals = false;
}
} else {
--left_cnt;
}
if (left_cnt == 0) {
ret.push_back({{left_closed, left_idx},{each.is_closed, each.val}});
is_new_intervals = true;
}
}
return ret;
}
``````

### Partitioning and sorting an array with many repeated entries

Suppose you need to reorder the elements of a very large array so that equal elements appear together . For example , if the array is ( b , a , c , b , d , a , b , d ) then ( a , a , b , b , b , c , d , d ) is an acceptable reordering , as is ( d, d , c , a,a , b , b , b ) .

If the entries are integers , this reordering can be achieved by sorting the array . If the number of distinct integers is very small relative to the size of the array , an efficient approach to sorting the array is to count the number of occurrences of each distinct integer and write the appropriate number of each integer , in sorted order , to the array . When array entries are objects , with multiple fields , only one of which is to be used as a key , the problem is harder to solve.

You are given an array of student objects . Each student has an integer - valued age field that is to be treated as a key . Rearrange the elements of the array so that students of equal age appear together . The order in which different ages appear is not important . How would your solution change if ages have to appear in sorted order?

Hint Count the number of students for each age .

``````void GroupByAge(vector<Person>* people) {
// TODO - you fill in here
vector<Person>& p_ref = *people;
std::unordered_map<int, vector<string>> age_to_name;
for (auto &each: p_ref) {
age_to_name[each.age].emplace_back(each.name);
}
int c_idx = 0;
for (auto &each : age_to_name) {
for (auto &name: each.second) {
p_ref[c_idx++] = {each.first, name};
}

}
return;
}
``````

``````void GroupByAge(vector<Person>* people) {
// TODO - you fill in here
vector<Person>& p_ref = *people;
std::unordered_map<int, int> age_to_cnt;
for (auto &each: p_ref) {
++age_to_cnt[each.age];
}
std::unordered_map<int, int> age_to_offset;
int offset = 0;
for (auto &each : age_to_cnt) {
age_to_offset[each.first] = offset;
offset += each.second;
}

int c_idx = 0;
while (!age_to_offset.empty()) {
// 将from 的元素替换到to
auto from = age_to_offset.begin();
// 找到from处年龄对应的offset位置的元素的年龄
auto to = age_to_offset.find(p_ref[from->second].age);
// 上一步找到两个年龄的offset, 将from替换到to， 此时to位置的元素处于正确的位置， 而from未必
std::swap(p_ref[from->second], p_ref[to->second]);

--age_to_cnt[to->first];
if (age_to_cnt[to->first] > 0) {
++age_to_offset[to->first];
} else {
age_to_offset.erase(to->first);
}
}
return;
}
``````

### Team photo day---1

You are a photographer for a soccer meet. You will be taking pictures of pairs of opposing teams. All teams have the same number of players . A team photo consists of a front row of players and a back row of players . A player in the back row must be taller than the player in front of him , as illustrated in Figure 14.3. All players in a row must be from the same team.

Design an algorithm that takes as input two teams and the heights of the players in the teams and checks if it is possible to place players to take the photo subject to the placement constraint.

~~这个题不允许排序...虽然可以开新空间复制元素后排序。~~

..... ``````  // Checks if team0 can be placed in front of team1.
static bool ValidPlacementExists(const Team& team0, const Team& team1) {
// TODO - you fill in here.
auto team0_sort(team0.players_);
auto team1_sort(team1.players_);
std::sort(team0_sort.begin(), team0_sort.end());
std::sort(team1_sort.begin(), team1_sort.end());
for (int i = 0; i < team0_sort.size(); ++i) {
if (!(team0_sort[i] < team1_sort[i])) {
return false;
}
}
return true;
}
``````

### Implement a fast sorting algorithm for lists

Implement a routine which sorts lists efficiently . It should be a stable sort , i .e. , the relative positions of equal elements must remain unchanged .

Hint: In what respects are lists superior to arrays ?

``````shared_ptr<ListNode<int>> MergeTwoSortList(shared_ptr<ListNode<int>> L0, shared_ptr<ListNode<int>> L1) {

while (L0 && L1) {
if (L0->data < L1->data) {
ptr->next = L0;
L0 = L0->next;
} else {
ptr->next = L1;
L1 = L1->next;
}
ptr = ptr->next;
}
if (L0 == nullptr) {
ptr->next = L1;
} else {
ptr->next = L0;
}
}

shared_ptr<ListNode<int>> StableSortList(shared_ptr<ListNode<int>> L) {
// TODO - you fill in here.
if (L == nullptr || L->next == nullptr) {
return L;
}
// 用快慢指针找到链表中点
shared_ptr<ListNode<int>> pre_slow = nullptr, slow = L, fast = L;
while (fast && fast->next) {
pre_slow = slow;
fast = fast->next->next;
slow = slow->next;
}
// 断开链表
pre_slow->next = nullptr;
// 递归归并排序
return MergeTwoSortList(StableSortList(L), StableSortList(slow));
}
``````

### Compute a salary threshold

You are working in the finance office for ABC corporation . ABC needs to cut payroll expenses to a specified target . The chief executive officer wants to do this by putting a cap on last year ' s salaries. Every employee who earned more than the cap last year will be paid the cap this year ; employees who earned no more than the cap will see no change in their salary .

For example , if there were five employees with salaries last year were \\$ 90 ,\\$ 30 , \\$ 100 , \\$ 40 , and \\$ 20 , and the target payroll this year is \\$ 210 , then 60 is a suitable salary cap , since 60 + 30 + 60 + 40 + 20 = 210.

``````double FindSalaryCap(int target_payroll, vector<int> current_salaries) {
// TODO - you fill in here.
std::sort(current_salaries.begin(), current_salaries.end());
// 当前cap下的薪水和