Create a guide for authoring in C++ #208

Closed
hobovsky opened this issue Dec 30, 2020 · 7 comments · Fixed by #311
Labels
documentation Improvements or additions to documentation kind/tutorial New Tutorial lancuage/cpp Articles related to C++

Comments

@hobovsky
Contributor

hobovsky commented Dec 30, 2020

Create an article similar to https://docs.codewars.com/languages/python/authoring/ but for C++.

Points needing particular attention:

  • includes (missing includes, C includes)
  • random utilities
  • stringizers
  • custom assertion messages (modifications to the testing framework: "Add overloads to show additional custom message on failure", snowhouse#3)
  • compilation warnings
  • input and output values: const ref vs value, replacements for arrays and std::vector
  • avoid C: strings, arrays
  • input modification, changing the signature of solution function
@hobovsky hobovsky added documentation Improvements or additions to documentation kind/recipe New Recipe lancuage/cpp Articles related to C++ labels Dec 30, 2020
@Steffan153 Steffan153 added kind/tutorial New Tutorial and removed kind/recipe New Recipe labels Dec 30, 2020
@hobovsky
Contributor Author

@kazk could you share some insight on how solutions in C++ are built? What gets concatenated with what, what exactly is in the template (some includes? main?), etc?

@error256
Contributor

From codewars/runner#35 (comment)

#include <igloo/igloo_alt.h>
#include <igloo/CodewarsTestListener.h>

using namespace igloo;

// #include "preloaded.h" if preloaded

// [solution]

// [tests]

int main(int, const char *[]) {
  NullTestResultsOutput output;
  TestRunner runner(output);
  CodewarsTestListener listener;
  runner.AddListener(&listener);
  runner.Run();
}

@hobovsky
Contributor Author

hobovsky commented Jan 16, 2021

Would it still work if it looked like this:

// #include "preloaded.h" if preloaded

// [solution]

#include <igloo/igloo_alt.h>
using namespace igloo;

// [tests]

#include <igloo/CodewarsTestListener.h>

int main(int, const char *[]) {
  NullTestResultsOutput output;
  TestRunner runner(output);
  CodewarsTestListener listener;
  runner.AddListener(&listener);
  runner.Run();
}

Not that I am proposing such a change at the moment. I am just trying to figure out the dependencies between snippets, what can potentially pollute what, etc.

@wtlgo

wtlgo commented Apr 27, 2021

@hobovsky asked me to share my thoughts about the proper usage of the <random> library.
So, here they are:

  • The C way of generating random data, rand() % (a - b) + b, should be avoided unless it's really necessary. It's outdated.
  • A PRNG should be defined once per Describe unless, of course, it's necessary to have several PRNGs.
  • std::random_device should not be used as the main source of random data but, as intended by the standard, as a seeder for a proper PRNG (std::mt19937, std::minstd_rand0, std::ranlux48_base, etc.). The common way of doing this is std::mt19937 engine{ std::random_device{}() };. It's worth noting that it's better to keep the std::random_device instance if you're going to create more than one PRNG per Describe.
  • A PRNG should not be seeded with the current time; it's bad practice.
  • If the distribution instance does not change between tests, it should probably be defined once in the Describe body, the same as the PRNG instance.
  • std::shuffle should be used to shuffle a container.
  • std::shuffle should not be used to pick random elements from a container. Use std::sample instead.
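
As a rough, framework-free sketch of the point about keeping the std::random_device instance: the fixture below seeds two engines from one device. The names RandomFixture, value_engine, order_engine, make_inputs and shuffled are hypothetical and only serve the illustration; inside a kata the same members would presumably live in the Describe body.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical fixture: one std::random_device seeds two independent engines.
struct RandomFixture {
    std::random_device seeder;                          // used for seeding only
    std::mt19937 value_engine{ seeder() };              // generates test inputs
    std::mt19937 order_engine{ seeder() };              // decides the order of test cases
    std::uniform_int_distribution<int> value{ 1, 100 }; // closed range [1, 100]

    // Builds `count` random inputs, each in [1, 100].
    std::vector<int> make_inputs(std::size_t count) {
        std::vector<int> inputs(count);
        std::generate(inputs.begin(), inputs.end(),
                      [this] { return value(value_engine); });
        return inputs;
    }

    // Returns the same cases in a random order.
    std::vector<int> shuffled(std::vector<int> cases) {
        std::shuffle(cases.begin(), cases.end(), order_engine);
        return cases;
    }
};
```

Members are initialized in declaration order, so seeder is ready before either engine asks it for a seed.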

Some examples from me (which I actually saw on CW)

// The tested function
int f(int a, int b = 0) {
    return a + b;
}
Describe(Bad) {
    It(c_style_rand) {
        srand(time(NULL));
        for(int i = 0; i < 100; ++i) {
             const int n = rand() % (100 - 1) + 1; // Please, let it go
             Assert::That(f(n), Equals(n)); 
        }
    }
};
Describe(Bad) {
private:
    int rand1_100() {
        std::mt19937 engine{ std::random_device{}() };
        return std::uniform_int_distribution<int>{ 1, 100 }(engine);
    }

public:
    It(create_prng_multiple_times) {
        for(int i = 0; i < 100; ++i) {
             const int n = rand1_100(); // PRNG is recreated 100 times.
             Assert::That(f(n), Equals(n)); 
        }
    }
};
Describe(Bad) {
private:
    std::mt19937 engine{ std::random_device{}() };

    int rand(const int a, const int b) {
        return std::uniform_int_distribution<int>{ a, b }(engine);
    }

public:
    It(create_distribution_multiple_times) {
        for(int i = 0; i < 100; ++i) {
             const int n = rand(1, 100); // Distribution range doesn't really change between cases but is recreated 100 times
             Assert::That(f(n), Equals(n)); 
        }
    }
};
Describe(Bad) {
private:
    std::random_device engine; // Shorter to type, but not reliable
    std::uniform_int_distribution<int> dist{ 1, 100 };

public:
    It(random_device_as_main_rng) {
        for(int i = 0; i < 100; ++i) {
             const int n = dist(engine);
             Assert::That(f(n), Equals(n)); 
        }
    }
};
Describe(Bad) {
private:
    std::mt19937 engine{ time(NULL) }; // It's a bad practice
    std::uniform_int_distribution<int> dist{ 1, 100 };

public:
    It(time_as_seed) {
        for(int i = 0; i < 100; ++i) {
             const int n = dist(engine);
             Assert::That(f(n), Equals(n)); 
        }
    }
};
Describe(Bad) {
private:
    std::mt19937 engine{ std::chrono::system_clock::now().time_since_epoch().count() }; // As bad as the previous one, but harder to read
    std::uniform_int_distribution<int> dist{ 1, 100 };

public:
    It(time_as_seed) {
        for(int i = 0; i < 100; ++i) {
             const int n = dist(engine);
             Assert::That(f(n), Equals(n)); 
        }
    }
};
Describe(Bad) {
private:
    std::mt19937 engine{ std::random_device{}() };

public:
    It(random_shuffle) {
        std::array<int, 100> arr;
        std::generate_n(arr.begin(), 100, [n = 0]() mutable { return ++n; });
       
        // Reinventing the wheel.
        for(int i = 0; i < 100; ++i) {
            std::uniform_int_distribution<int> dist{ i, 99 };
            std::swap(arr[i], arr[dist(engine)]);
        }

        for(int i = 0; i < 100; ++i) {
             const int n = arr[i];
             Assert::That(f(n), Equals(n)); 
        }
    }
};
Describe(Bad) {
private:
    std::mt19937 engine{ std::random_device{}() };

public:
    It(pick_elements_with_shuffle) {
        std::array<int, 100> arr;
        std::generate_n(arr.begin(), 100, [n = 0]() mutable { return ++n; });

        for(int i = 0; i < 100; ++i) {
             std::shuffle(arr.begin(), arr.end(), engine); // Does a ton of work just to get two numbers
             Assert::That(f(arr[0], arr[1]), Equals(arr[0] + arr[1])); 
        }
    }
};

And how it (in my opinion) should be

Describe(Good) {
private:
    std::mt19937 engine{ std::random_device{}() };
    std::uniform_int_distribution<int> dist{ 1, 100 };

public:
    It(random_numbers) {
        for(int i = 0; i < 100; ++i) {
             const int n = dist(engine);
             Assert::That(f(n), Equals(n)); 
        }
    }

    It(random_sequence) {
        std::array<int, 100> arr;
        std::generate_n(arr.begin(), 100, [n = 0]() mutable { return ++n; });
       
        std::shuffle(arr.begin(), arr.end(), engine);

        for(int i = 0; i < 100; ++i) {
             const int n = arr[i];
             Assert::That(f(n), Equals(n)); 
        }
    }

    It(random_pick) {
        std::array<int, 100> arr;
        std::generate_n(arr.begin(), 100, [n = 0]() mutable { return ++n; });
       
        int n[2];
        for(int i = 0; i < 100; ++i) {
             std::sample(arr.cbegin(), arr.cend(), n, 2, engine);
             Assert::That(f(n[0], n[1]), Equals(n[0] + n[1])); 
        }
    }
};

Also, not part of the main point, but maybe a useful bit of advice: to avoid writing (engine) every single time, I usually do this

Describe(WhoKnows) {
private:
     std::mt19937 engine{ std::random_device{}() };
     // std::ref makes all generators share one engine; binding `engine` by value would give each its own copy with identical state.
     std::function<int()> rand_number = std::bind(std::uniform_int_distribution<int>{ 1, 100 }, std::ref(engine));
     std::function<size_t()> rand_length = std::bind(std::uniform_int_distribution<size_t>{ 20, 100 }, std::ref(engine));
     std::function<double()> rand_coefficient = std::bind(std::uniform_real_distribution<double>{ -1, 1 }, std::ref(engine));
     // uniform_int_distribution is not specified for char, so draw an int and let std::function convert the result.
     std::function<char()> rand_letter = std::bind(std::uniform_int_distribution<int>{ 'a', 'z' }, std::ref(engine));
     std::function<bool()> rand_bool = std::bind(std::uniform_int_distribution<int>{ 0, 1 }, std::ref(engine));
     // etc.
};

After that you can call them directly as rand_number() or rand_bool(), without thinking about the engine at all. I find it convenient.
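
One detail worth checking with this pattern is how std::bind stores the engine: an argument passed by value is copied, while std::ref keeps a reference to the original. The sketch below illustrates the difference with a fixed seed; the function names copies_repeat_each_other and reference_advances_engine are hypothetical.

```cpp
#include <cassert>
#include <functional>
#include <random>

// Returns true when two generators bound to *copies* of the same engine
// produce the same first value. The copies start from identical state,
// so this holds for any seed.
inline bool copies_repeat_each_other(unsigned seed) {
    std::mt19937 engine{ seed };
    std::uniform_int_distribution<int> dist{ 1, 1'000'000 };
    auto first  = std::bind(dist, engine);  // copies the engine
    auto second = std::bind(dist, engine);  // copies the same state again
    return first() == second();
}

// Returns true when drawing through std::ref advances the shared engine.
inline bool reference_advances_engine(unsigned seed) {
    std::mt19937 engine{ seed };
    const std::mt19937 untouched = engine;  // snapshot of the initial state
    auto draw = std::bind(std::uniform_int_distribution<int>{ 1, 100 },
                          std::ref(engine));
    draw();
    return engine != untouched;
}
```

With by-value binding, generators that share a distribution return the same sequence, which is rarely what a random test wants.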

That's it. I hope I didn't miss anything.

@error256
Contributor

Random tests don't need ideal distributions, thread safety or anything like that, so the recommendations above are more like what examples in the guide probably should use, but I don't see anything bad if someone likes to use rand as long as it's correct and doesn't make the code overcomplicated.
Speaking of best practices outside of CW, isn't std::function outdated for most of its possible use cases? If I understand it correctly, it doesn't have type info, so it can't be inlined or otherwise optimized, and it doesn't know the size of bound variables, so it has to allocate memory dynamically...
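
One possible alternative, sketched here without the test framework, is to use plain member functions instead of std::function members. The struct name Generators and its members are hypothetical; std::bernoulli_distribution is a swapped-in choice for the boolean case, not something proposed in this thread.

```cpp
#include <cassert>
#include <random>

// Hypothetical fixture: ordinary member functions wrap the distributions,
// so no type erasure or std::bind is involved.
struct Generators {
    std::mt19937 engine{ std::random_device{}() };
    std::uniform_int_distribution<int> number{ 1, 100 };
    std::uniform_int_distribution<int> letter{ 'a', 'z' };  // int, converted to char below
    std::bernoulli_distribution coin{ 0.5 };

    int  rand_number() { return number(engine); }
    char rand_letter() { return static_cast<char>(letter(engine)); }
    bool rand_bool()   { return coin(engine); }
};
```

Call sites stay as short as with the std::function version, for example gen.rand_number().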

@hobovsky
Contributor Author

hobovsky commented Apr 29, 2021

I am not bothered at all by the quality of random data generated with rand. It's perfectly sufficient for the use cases of kata.
The main reason why I think it would be worthwhile to discourage rand is that authors tend to get it terribly wrong when it comes to anything more complex than rand() % 100. Seeding is done improperly, bounds of ranges are off, and randomization of values larger than RAND_MAX or of other types (unsigned, floating point) takes terribly twisted (and error-prone) forms.

When reviewing, I would be perfectly fine with correctly used rand. I am not sure if it's worth promoting though, and how to put it in the docs. "You can use rand as long as you do this correctly." Duh! Of course I do! (proceeds to generate digits of a string to parse it to unsigned long long value).
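
For comparison, here is a minimal sketch of how <random> handles the cases that tend to go wrong with rand: values beyond RAND_MAX and floating-point values. The helper names random_u64 and random_double are hypothetical, and the engine is assumed to be created and seeded once by the caller.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <random>

// Uniform 64-bit unsigned value in the closed range [lo, hi];
// both bounds can lie far beyond RAND_MAX.
inline std::uint64_t random_u64(std::mt19937_64& engine,
                                std::uint64_t lo, std::uint64_t hi) {
    return std::uniform_int_distribution<std::uint64_t>{ lo, hi }(engine);
}

// Uniform double, nominally in the half-open range [lo, hi).
inline double random_double(std::mt19937_64& engine, double lo, double hi) {
    return std::uniform_real_distribution<double>{ lo, hi }(engine);
}
```

The distributions are built per call here only because the bounds are parameters; with fixed bounds they could be kept as members.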

@Voileexperiments

From the standpoint of a less honest user, srand is highly exploitable, so rand should definitely be avoided in C++ (since it's easy to do so). In C, I don't think a universal alternative exists (/dev/urandom isn't available on Windows, etc.).
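
To illustrate why srand is a concern, the sketch below assumes a test suite that draws its inputs from rand. Because rand uses one global state, any code running in the same program can reseed it and know the next value in advance. Both function names are hypothetical stand-ins.

```cpp
#include <cassert>
#include <cstdlib>

// Stand-in for a test suite that draws its "random" input from rand().
inline int next_test_input() {
    return std::rand() % 100;
}

// What a dishonest solution could do: reseed the shared generator with a
// known value, peek at the next draw, then rewind to the same seed.
inline int predict_next_input(unsigned seed) {
    std::srand(seed);
    const int predicted = std::rand() % 100;
    std::srand(seed);  // rewind so the tests draw the same value
    return predicted;
}
```

An engine kept as a private member of the test fixture is not reachable from the solution in this way.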
