Googletest export

Fix Theta(N^2) memory usage of EXPECT_EQ(string) when the strings don't match.

The underlying CalculateOptimalEdits() implementation used a simple
dynamic-programming approach that always used N^2 memory and time. This meant
that tests for equality of large strings were ticking time bombs: They'd work
fine as long as the test passed, but as soon as the strings differed the test
would OOM, which is very hard to debug.
I switched it out for a Dijkstra search, which is still worst-case O(N^2), but
in the usual case of mostly-matching strings, it is much closer to linear.

PiperOrigin-RevId: 210405025
This commit is contained in:
Abseil Team
2018-08-27 14:53:25 -04:00
committed by Gennadiy Civil
parent 1bb76182ca
commit 167c5e8188
5 changed files with 580 additions and 265 deletions

View File

@@ -159,15 +159,15 @@ GTEST_DISABLE_MSC_WARNINGS_POP_() // 4275
#endif // GTEST_HAS_EXCEPTIONS
namespace edit_distance {
// Returns the optimal edits to go from 'left' to 'right'.
// All edits cost the same, with replace having lower priority than
// add/remove.
// Simple implementation of the Wagner-Fischer algorithm.
// See http://en.wikipedia.org/wiki/Wagner-Fischer_algorithm
enum EditType { kMatch, kAdd, kRemove, kReplace };
// add/remove. Returns an approximation of the maximum memory used in
// 'memory_usage' if non-null.
// Uses a Dijkstra search, with a couple of simple bells and whistles added on.
enum EditType { kEditMatch, kEditAdd, kEditRemove, kEditReplace };
GTEST_API_ std::vector<EditType> CalculateOptimalEdits(
const std::vector<size_t>& left, const std::vector<size_t>& right);
const std::vector<size_t>& left, const std::vector<size_t>& right,
size_t* memory_usage = NULL);
// Same as above, but the input is represented as strings.
GTEST_API_ std::vector<EditType> CalculateOptimalEdits(
@@ -179,8 +179,6 @@ GTEST_API_ std::string CreateUnifiedDiff(const std::vector<std::string>& left,
const std::vector<std::string>& right,
size_t context = 2);
} // namespace edit_distance
// Calculate the diff between 'left' and 'right' and return it in unified diff
// format.
// If not null, stores in 'total_line_count' the total number of lines found