Create a guide for authoring in C++ #208

hobovsky · 2020-12-30T12:02:41Z

Create an article similar to https://docs.codewars.com/languages/python/authoring/ but for C++.

Points needing particular attention:

includes (missing includes, C includes)
random utilities
stringizers
custom assertion messages (modifications to the testing framework: "Add overloads to show additional custom message on failure", snowhouse#3)
compilation warnings
input and output values: const ref vs value, replacements for arrays and std::vector
avoid C: strings, arrays
input modification, changing the signature of solution function

hobovsky · 2021-01-16T13:13:27Z

@kazk could you share some insight on how solutions in C++ are built? What gets concatenated with what, what exactly is in the template (some includes? main?), etc?

error256 · 2021-01-16T15:47:23Z

From codewars/runner#35 (comment)

#include <igloo/igloo_alt.h>
#include <igloo/CodewarsTestListener.h>

using namespace igloo;

// #include "preloaded.h" if preloaded

// [solution]

// [tests]

int main(int, const char *[]) {
  NullTestResultsOutput output;
  TestRunner runner(output);
  CodewarsTestListener listener;
  runner.AddListener(&listener);
  runner.Run();
}

hobovsky · 2021-01-16T18:00:50Z

Would it still work if it looked like this:

// #include "preloaded.h" if preloaded

// [solution]

#include <igloo/igloo_alt.h>
using namespace igloo;

// [tests]

#include <igloo/CodewarsTestListener.h>

int main(int, const char *[]) {
  NullTestResultsOutput output;
  TestRunner runner(output);
  CodewarsTestListener listener;
  runner.AddListener(&listener);
  runner.Run();
}

Not that I am proposing such a change at the moment; I am just trying to figure out the dependencies between the snippets, what can potentially pollute what, etc.

wtlgo · 2021-04-27T11:19:16Z

@hobovsky asked me to share my thoughts about the proper usage of the <random> library.
So, here they are:

The C way of generating random data, rand() % (a - b) + b, should be avoided unless it's really necessary. It's outdated.
A PRNG should be defined once per Describe unless, of course, it's necessary to have several PRNGs.
std::random_device should not be used as the main source of random data but, as intended by the standard, as a seeder for a proper PRNG (std::mt19937, std::minstd_rand0, std::ranlux48_base, etc.). The common way of doing this is std::mt19937 engine{ std::random_device{}() };. It's worth noting that it's better to keep the std::random_device instance if you're going to create more than one PRNG per Describe.
A PRNG should not be seeded with the time; it's bad practice.
If the distribution instance does not change between tests, it should probably be defined once in the Describe body, the same as the PRNG instance.
std::shuffle should be used to shuffle a container.
std::shuffle should not be used to pick random elements from the container. You must use std::sample instead.
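
The seeding advice above can be sketched without the test framework. This is only an illustration, not code from the thread: the `Generators` struct and its member names are hypothetical, and C++17 is assumed for std::sample. One std::random_device is kept and seeds two engines, the distribution is created once, and std::sample picks elements without shuffling the whole container.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <random>
#include <vector>

// Hypothetical fixture: one seeder, two independently seeded engines.
struct Generators {
    std::random_device seeder;                          // kept so it can seed more than one engine
    std::mt19937 values_engine{ seeder() };
    std::mt19937 picks_engine{ seeder() };
    std::uniform_int_distribution<int> dist{ 1, 100 };  // created once, reused for every value

    int next_value() { return dist(values_engine); }

    // Picks `count` distinct elements of `pool`, keeping their original relative order.
    std::vector<int> pick(const std::vector<int>& pool, std::size_t count) {
        assert(count <= pool.size());
        std::vector<int> out;
        std::sample(pool.cbegin(), pool.cend(), std::back_inserter(out), count, picks_engine);
        return out;
    }
};
```

In a kata, the same members would simply live in the private section of the Describe block.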

Some examples from me (which I actually saw on CW):

// The tested function
int f(int a, int b = 0) {
    return a + b;
}

Describe(Bad) {
    It(c_style_rand) {
        srand(time(NULL));
        for(int i = 0; i < 100; ++i) {
             const int n = rand() % (100 - 1) + 1; // Please, let it go
             Assert::That(f(n), Equals(n)); 
        }
    }
};

Describe(Bad) {
private:
    int rand1_100() {
        std::mt19937 engine{ std::random_device{}() };
        return std::uniform_int_distribution<int>{ 1, 100 }(engine);
    }

public:
    It(create_prng_multiple_times) {
        for(int i = 0; i < 100; ++i) {
             const int n = rand1_100(); // PRNG is recreated 100 times.
             Assert::That(f(n), Equals(n)); 
        }
    }
};

Describe(Bad) {
private:
    std::mt19937 engine{ std::random_device{}() };

    int rand(const int a, const int b) {
        return std::uniform_int_distribution<int>{ a, b }(engine);
    }

public:
    It(create_distribution_multiple_times) {
        for(int i = 0; i < 100; ++i) {
             const int n = rand(1, 100); // Distribution range doesn't really change between cases but is recreated 100 times
             Assert::That(f(n), Equals(n)); 
        }
    }
};

Describe(Bad) {
private:
    std::random_device engine; // Shorter to type, but not reliable
    std::uniform_int_distribution<int> dist{ 1, 100 };

public:
    It(random_device_as_main_rng) {
        for(int i = 0; i < 100; ++i) {
             const int n = dist(engine);
             Assert::That(f(n), Equals(n)); 
        }
    }
};

Describe(Bad) {
private:
    std::mt19937 engine{ static_cast<unsigned>(time(NULL)) }; // Seeding with time is bad practice
    std::uniform_int_distribution<int> dist{ 1, 100 };

public:
    It(time_as_seed) {
        for(int i = 0; i < 100; ++i) {
             const int n = dist(engine);
             Assert::That(f(n), Equals(n)); 
        }
    }
};

Describe(Bad) {
private:
    std::mt19937 engine{ static_cast<unsigned>(std::chrono::system_clock::now().time_since_epoch().count()) }; // As bad as the previous one, but harder to read
    std::uniform_int_distribution<int> dist{ 1, 100 };

public:
    It(time_as_seed) {
        for(int i = 0; i < 100; ++i) {
             const int n = dist(engine);
             Assert::That(f(n), Equals(n)); 
        }
    }
};

Describe(Bad) {
private:
    std::mt19937 engine{ std::random_device{}() };

public:
    It(random_shuffle) {
        std::array<int, 100> arr;
        std::generate_n(arr.begin(), 100, [n = 0]() mutable { return ++n; });
       
        // Reinventing the wheel.
        for(int i = 0; i < 100; ++i) {
            std::uniform_int_distribution<int> dist{ i, 99 };
            std::swap(arr[i], arr[dist(engine)]);
        }

        for(int i = 0; i < 100; ++i) {
             const int n = arr[i];
             Assert::That(f(n), Equals(n)); 
        }
    }
};

Describe(Bad) {
private:
    std::mt19937 engine{ std::random_device{}() };

public:
    It(pick_elements_with_shuffle) {
        std::array<int, 100> arr;
        std::generate_n(arr.begin(), 100, [n = 0]() mutable { return ++n; });

        for(int i = 0; i < 100; ++i) {
             std::shuffle(arr.begin(), arr.end(), engine); // Does a ton of work just to get two numbers
             Assert::That(f(arr[0], arr[1]), Equals(arr[0] + arr[1])); 
        }
    }
};

And how it (in my opinion) should be:

Describe(Good) {
private:
    std::mt19937 engine{ std::random_device{}() };
    std::uniform_int_distribution<int> dist{ 1, 100 };

public:
    It(random_numbers) {
        for(int i = 0; i < 100; ++i) {
             const int n = dist(engine);
             Assert::That(f(n), Equals(n)); 
        }
    }

    It(random_sequence) {
        std::array<int, 100> arr;
        std::generate_n(arr.begin(), 100, [n = 0]() mutable { return ++n; });
       
        std::shuffle(arr.begin(), arr.end(), engine);

        for(int i = 0; i < 100; ++i) {
             const int n = arr[i];
             Assert::That(f(n), Equals(n)); 
        }
    }

    It(random_pick) {
        std::array<int, 100> arr;
        std::generate_n(arr.begin(), 100, [n = 0]() mutable { return ++n; });
       
        int n[2];
        for(int i = 0; i < 100; ++i) {
             std::sample(arr.cbegin(), arr.cend(), n, 2, engine);
             Assert::That(f(n[0], n[1]), Equals(n[0] + n[1])); 
        }
    }
};

Also, not part of the main thing, but maybe a bit of useful advice. To avoid writing (engine) every single time, I usually do this:

Describe(WhoKnows) {
private:
     std::mt19937 engine{ std::random_device{}() };
     // std::ref makes all helpers share one engine; std::bind would otherwise copy it into each helper
     std::function<int()> rand_number = std::bind(std::uniform_int_distribution<int>{ 1, 100 }, std::ref(engine));
     std::function<size_t()> rand_length = std::bind(std::uniform_int_distribution<size_t>{ 20, 100 }, std::ref(engine));
     std::function<double()> rand_coefficient = std::bind(std::uniform_real_distribution<double>{ -1, 1 }, std::ref(engine));
     // char is not a valid type for std::uniform_int_distribution, so int is used and converted on return
     std::function<char()> rand_letter = std::bind(std::uniform_int_distribution<int>{ 'a', 'z' }, std::ref(engine));
     std::function<bool()> rand_bool = std::bind(std::uniform_int_distribution<int>{ 0, 1 }, std::ref(engine));
     // etc.
};

Afterwards you can call them directly as rand_number() or rand_bool() without thinking about the engine at all. I find it convenient.

That's it. I hope I didn't miss anything.
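
One detail of std::bind is worth keeping in mind with helpers like these: it stores copies of its arguments unless they are wrapped in std::ref. The sketch below is illustrative only; the function names are hypothetical, and the fixed seed is there just to make the behaviour repeatable. It shows that two helpers bound to the same engine by value replay the same sequence, while a helper bound with std::ref advances the original engine.

```cpp
#include <cassert>
#include <functional>
#include <random>

// Returns true when two helpers bound by value repeat each other's output.
bool bound_copies_repeat(std::mt19937::result_type seed) {
    std::mt19937 engine{ seed };
    std::uniform_int_distribution<int> dist{ 1, 1000000 };
    auto first  = std::bind(dist, engine);   // stores its own copy of engine
    auto second = std::bind(dist, engine);   // another copy, in the same state
    for (int i = 0; i < 10; ++i)
        if (first() != second()) return false;
    return true;
}

// Returns true when a helper bound with std::ref advances the original engine.
bool bound_reference_advances(std::mt19937::result_type seed) {
    std::mt19937 engine{ seed };
    const std::mt19937 untouched{ seed };
    assert(engine == untouched);             // same seed, same initial state
    auto helper = std::bind(std::uniform_int_distribution<int>{ 1, 1000000 }, std::ref(engine));
    helper();                                // draws from engine itself, not from a copy
    return engine != untouched;
}
```

Whether independent copies or one shared engine is preferable depends on the tests; with copies, helpers created from the same engine state produce correlated values.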

error256 · 2021-04-29T17:46:14Z

Random tests don't need ideal distributions, thread safety, or anything like that, so the recommendations above are more what the examples in the guide should probably use. I don't see anything bad in someone using rand as long as it's correct and doesn't overcomplicate the code.
Speaking of best practices outside of CW, isn't std::function outdated for most of its possible use cases? If I understand it correctly, it doesn't carry type info, so it can't be inlined or otherwise optimized, and it doesn't know the size of the bound variables, so it has to allocate memory dynamically...
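
For comparison, a helper that avoids type erasure might look like the sketch below. The `BoundDist` class template and the `Fixture` struct are hypothetical (they are not part of igloo or the standard library). The wrapper keeps a pointer to a shared engine and stores the distribution by value, so the call operator is an ordinary member function and the wrapper itself allocates nothing.

```cpp
#include <cassert>
#include <random>
#include <utility>

// Hypothetical wrapper: pairs a distribution with a shared engine.
template <class Dist, class Engine = std::mt19937>
class BoundDist {
public:
    BoundDist(Engine& engine, Dist dist) : engine_(&engine), dist_(std::move(dist)) {}

    typename Dist::result_type operator()() { return dist_(*engine_); }

private:
    Engine* engine_;   // not owned; the engine must outlive this object
    Dist dist_;
};

// Example fixture using the wrapper as ordinary data members.
struct Fixture {
    std::mt19937 engine{ std::random_device{}() };
    BoundDist<std::uniform_int_distribution<int>> rand_number{
        engine, std::uniform_int_distribution<int>{ 1, 100 } };
    BoundDist<std::uniform_real_distribution<double>> rand_coefficient{
        engine, std::uniform_real_distribution<double>{ -1.0, 1.0 } };
};

// Usage sketch: draws one value from each helper and checks the ranges.
void use_fixture() {
    Fixture fixture;
    const int n = fixture.rand_number();
    const double c = fixture.rand_coefficient();
    assert(n >= 1 && n <= 100);
    assert(c >= -1.0 && c <= 1.0);
}
```

The trade-off is more verbose member declarations; for kata tests the overhead of std::function is unlikely to matter either way.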

hobovsky · 2021-04-29T18:13:08Z

I am not bothered at all by the quality of random data generated with rand. It's perfectly sufficient for the use cases of kata.
The main reason why I think it would be worthwhile to discourage rand is that authors tend to get it terribly wrong when it comes to anything more complex than rand() % 100. Seeding is done improperly, bounds of ranges are off, and randomization of values larger than RAND_MAX or of other types (unsigned, floating point) takes terribly twisted (and error-prone) forms.

When reviewing, I would be perfectly fine with correctly used rand. I am not sure if it's worth promoting though, and how to put it in the docs. "You can use rand as long as you do this correctly." Duh! Of course I do! (proceeds to generate digits of a string to parse it to unsigned long long value).
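
The awkward cases mentioned above (values beyond RAND_MAX, unsigned and floating-point types) can be handled directly by <random> distributions, without assembling digits or combining several rand() calls. A minimal sketch, with hypothetical helper names; the integer version treats both bounds as inclusive:

```cpp
#include <cassert>
#include <random>

// Hypothetical helper: uniform integer in the inclusive range [lo, hi].
unsigned long long random_ull(std::mt19937_64& engine, unsigned long long lo, unsigned long long hi) {
    assert(lo <= hi);
    return std::uniform_int_distribution<unsigned long long>{ lo, hi }(engine);
}

// Hypothetical helper: uniform floating-point value, specified as the half-open range [lo, hi).
double random_double(std::mt19937_64& engine, double lo, double hi) {
    assert(lo < hi);
    return std::uniform_real_distribution<double>{ lo, hi }(engine);
}
```

The distributions are created per call here only because the bounds are parameters; with fixed bounds they can be members created once, next to the engine.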

Voileexperiments · 2021-04-30T08:43:48Z

From the standpoint of a less-honest user, srand is highly exploitable, so rand should definitely be avoided in C++ (since it's easy to avoid there). In C, I don't think a universal alternative exists (/dev/urandom isn't available on Windows, etc.).
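
The concern can be illustrated with a short sketch: rand() draws from one global state shared by the whole program, and the same seed always reproduces the same sequence, so any code that can call srand controls what later rand() calls return. The function name below is hypothetical.

```cpp
#include <cassert>
#include <cstdlib>
#include <vector>

// Draws `count` values from rand() after seeding the global state with `seed`.
std::vector<int> draw_after_seed(unsigned seed, int count) {
    assert(count >= 0);
    std::srand(seed);   // global: affects every later rand() call in the program
    std::vector<int> values;
    for (int i = 0; i < count; ++i) values.push_back(std::rand());
    return values;
}
```

Calling draw_after_seed twice with the same seed yields identical vectors. An engine kept as a member of the test fixture is not exposed through a global function in the same way.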

hobovsky added the documentation, kind/recipe, and lancuage/cpp labels on Dec 30, 2020

Steffan153 added the kind/tutorial label and removed the kind/recipe label on Dec 30, 2020

hobovsky mentioned this issue Apr 27, 2021

C++ authoring tutorial #311

Merged

hobovsky mentioned this issue Apr 30, 2021

Replace rand with std::random_device in C++ codewars/content-issues#33

Open

hobovsky linked a pull request Apr 30, 2021 that will close this issue

C++ authoring tutorial #311

Merged

hobovsky closed this as completed in #311 May 3, 2021
