mwsohn/FreqTools.jl

FreqTools

A package that produces frequency tables and associated summary tables. It is mostly a wrapper for the excellent FreqTables.jl package. tab is the main function; it is intended for interactive use and does not return a value.

Installation

  ] add FreqTools

Syntax

  • tab(df::AbstractDataFrame, vars::Union{String,Symbol}...; skipmissing = true, sort = nothing, pct = :rce, summarize = nothing)
  • tab(na::NamedArray; skipmissing = true, pct = :rce)
  • tab(m::Matrix)

Produce a one-way or two-way frequency table from a DataFrame or from a NamedArray returned by the freqtable function. When three variables are given, a separate two-way table is printed for each value of the third variable.

If a matrix of counts is used as an argument, a Pearson chi-square test will be performed.
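
To make the reported statistic concrete, here is a minimal sketch of the Pearson chi-square computation for a matrix of counts, using only base Julia. The helper name `pearson_chisq` is hypothetical and written for this illustration; it is not part of FreqTools, and the package's internal implementation may differ.

```julia
# Pearson chi-square statistic for a matrix of observed counts.
# `pearson_chisq` is a hypothetical helper for this sketch, not a FreqTools function.
function pearson_chisq(counts::AbstractMatrix{<:Real})
    total = sum(counts)
    rowtot = sum(counts, dims = 2)        # r×1 matrix of row totals
    coltot = sum(counts, dims = 1)        # 1×c matrix of column totals
    expected = rowtot * coltot ./ total   # expected counts under independence
    chisq = sum((counts .- expected) .^ 2 ./ expected)
    df = (size(counts, 1) - 1) * (size(counts, 2) - 1)
    return chisq, df
end

# Same counts as the matrix example in this README.
chisq, df = pearson_chisq([44 52; 16 10; 55 12])
println("Pearson chi-square = ", round(chisq, digits = 4), " (", df, ")")
```

For these counts the statistic agrees with the value of about 21.779 on 2 degrees of freedom that tab prints for the same matrix.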

This package reexports FreqTables.jl.

Options:

  • skipmissing - set to false to include missing values in the frequency table (default = true)

  • sort - set to true to sort the output by frequency in descending order, with the largest category on top. It applies only to one-way tables (default = false)

  • pct - set to any combination of r (row), c (column), and e (cell), given as a Symbol, to request row, column, and cell percentages (default = :rce). For example, :rce produces a table with row, column, and cell percentages (in that order) stacked under each count. Set pct = nothing to print counts only

  • summarize - specify a "continuous" variable to produce means, standard deviations, and counts in one-way or two-way tables (default = nothing)
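
As an illustration of what the r, c, and e percentages mean, the following sketch computes them in base Julia from the Race by Smoke counts used in the examples of this README. It is only meant to show the arithmetic, under the assumption that the percentages are plain proportions of row, column, and grand totals; it does not call FreqTools.

```julia
# Race × Smoke counts from the lbw examples in this README.
counts = [44 52; 16 10; 55 12]

rowpct  = 100 .* counts ./ sum(counts, dims = 2)  # r: each row sums to 100
colpct  = 100 .* counts ./ sum(counts, dims = 1)  # c: each column sums to 100
cellpct = 100 .* counts ./ sum(counts)            # e: the whole table sums to 100

println("row %:    ", round.(rowpct, digits = 3))
println("column %: ", round.(colpct, digits = 3))
println("cell %:   ", round.(cellpct, digits = 3))
```

The first cell (44) gives roughly 45.833, 38.261, and 23.280, which are the three percentage lines printed under that count in the default two-way table.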

Examples

julia> using DataFrames, FreqTools, RDatasets

julia> lbw = dataset("COUNT","lbw");

julia> describe(lbw)
10×7 DataFrame
 Row │ variable  mean          min    median   max    nmissing  eltype   
     │ Symbol    Float64       Int64  Float64  Int64  Int64     DataType
─────┼───────────────────────────────────────────────────────────────────
   1 │ Low          0.312169       0      0.0      1         0  Int64
   2 │ Smoke        0.391534       0      0.0      1         0  Int64
   3 │ Race         1.84656        1      1.0      3         0  Int64
   4 │ Age         23.2381        14     23.0     45         0  Int64
   5 │ LWt        129.82          80    121.0    250         0  Int64
   6 │ PTL          0.195767       0      0.0      3         0  Int64
   7 │ Ht           0.0634921      0      0.0      1         0  Int64
   8 │ UI           0.148148       0      0.0      1         0  Int64
   9 │ FTV          0.793651       0      0.0      6         0  Int64
  10 │ BWt       2944.29         709   2977.0   4990         0  Int64

1. One-way frequency table

1.a Frequencies

julia> tab(lbw, :Low)
───────┬───────────────────────────
   Low │ Counts   Percent  Cum Pct 
───────┼───────────────────────────
     0 │    130    68.783   68.783
     1 │     59    31.217  100.000
───────┼───────────────────────────
 Total │    189   100.000  100.000
───────┴───────────────────────────

1.b Summarize birthweight by smoking status

julia> tab(lbw, :Smoke, summarize = :BWt)
───────┬────────────────────────
 Smoke │   N      Mean    StDev 
───────┼────────────────────────
     0 │ 115  3054.957  752.409
     1 │  74  2772.297  659.807
───────┼────────────────────────
 Total │ 189  2944.286  729.016
───────┴────────────────────────
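
The N, Mean, and StDev columns can be reproduced by hand with a grouped summary. The sketch below uses the Statistics standard library on toy data invented for illustration (not lbw values). The helper `summarize_by` is hypothetical, and the use of the sample standard deviation (Julia's `std` default) is an assumption about how tab computes StDev.

```julia
using Statistics

# Hypothetical helper for this sketch; not part of FreqTools.
# Returns one (level, n, avg, sd) named tuple per distinct value of `group`.
function summarize_by(group, value)
    return [(level = g,
             n = count(==(g), group),
             avg = mean(value[group .== g]),
             sd = std(value[group .== g]))   # sample SD (n - 1 denominator), an assumption
            for g in sort(unique(group))]
end

# Toy data invented for illustration; these are not lbw values.
group = [0, 0, 0, 1, 1, 1]
value = [10.0, 12.0, 14.0, 20.0, 22.0, 27.0]

rows = summarize_by(group, value)
for row in rows
    println(row.level, "  N = ", row.n, "  Mean = ", row.avg,
            "  StDev = ", round(row.sd, digits = 3))
end
```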

1.c Frequency sorted by counts

julia> tab(lbw,:Race, sort=true)
───────┬───────────────────────────
  Race │ Counts   Percent  Cum Pct 
───────┼───────────────────────────
     1 │     96    50.794   50.794
     3 │     67    35.450   86.243
     2 │     26    13.757  100.000
───────┼───────────────────────────
 Total │    189   100.000  100.000
───────┴───────────────────────────
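
A short sketch of the ordering and the cumulative percentages in the sorted table above, written in base Julia from the Race counts. It only illustrates the arithmetic and is not how FreqTools is necessarily implemented.

```julia
# Race levels and their counts, taken from the one-way table above.
levels = [1, 2, 3]
counts = [96, 26, 67]

order  = sortperm(counts, rev = true)          # largest category first
pct    = 100 .* counts[order] ./ sum(counts)   # percent of the total
cumpct = cumsum(pct)                           # running (cumulative) percent

for (l, n, p, c) in zip(levels[order], counts[order], pct, cumpct)
    println(l, "  ", n, "  ", round(p, digits = 3), "  ", round(c, digits = 3))
end
```

The cumulative percentage is accumulated in the sorted order, which is why level 3 shows about 86.243 rather than 100.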

2. Two-way frequency table

2.a Frequency table with row, column, and cell percentages (default)

julia> tab(lbw, :Race, :Smoke)

──────────────┬───────────────────────────
 Race / Smoke │       0        1    Total 
──────────────┼───────────────────────────
            1 │      44       52       96
              │  45.833   54.167  100.000
              │  38.261   70.270   50.794
              │  23.280   27.513   50.794
──────────────┼───────────────────────────
            2 │      16       10       26
              │  61.538   38.462  100.000
              │  13.913   13.514   13.757
              │   8.466    5.291   13.757
──────────────┼───────────────────────────
            3 │      55       12       67
              │  82.090   17.910  100.000
              │  47.826   16.216   35.450
              │  29.101    6.349   35.450
──────────────┼───────────────────────────
        Total │     115       74      189
              │  60.847   39.153  100.000
              │ 100.000  100.000  100.000
              │  60.847   39.153  100.000
──────────────┴───────────────────────────
Pearson chi-square = 21.7790 (2), p < 0.0001

2.b Pearson chi-square test for a matrix of counts

julia> tab([44 52 ; 16 10; 55 12])
───────┬───────────────────────────
 A / B │       1        2    Total 
───────┼───────────────────────────
     1 │      44       52       96
       │  45.833   54.167  100.000
       │  38.261   70.270   50.794
       │  23.280   27.513   50.794
───────┼───────────────────────────
     2 │      16       10       26
       │  61.538   38.462  100.000
       │  13.913   13.514   13.757
       │   8.466    5.291   13.757
───────┼───────────────────────────
     3 │      55       12       67
       │  82.090   17.910  100.000
       │  47.826   16.216   35.450
       │  29.101    6.349   35.450
───────┼───────────────────────────
 Total │     115       74      189
       │  60.847   39.153  100.000
       │ 100.000  100.000  100.000
       │  60.847   39.153  100.000
───────┴───────────────────────────
Pearson chi-square = 21.7790 (2), p < 0.0001

2.c Frequency table with no percentages

julia> tab(lbw,:Race,:Smoke, pct = nothing )
──────────────┬────────────────
 Race / Smoke │   0   1  Total 
──────────────┼────────────────
            1 │  44  52     96
            2 │  16  10     26
            3 │  55  12     67
──────────────┼────────────────
        Total │ 115  74    189
──────────────┴────────────────
Pearson chi-square = 21.7790 (2), p < 0.0001

2.d Frequency table with row percentages alone

julia> tab(lbw, :Race, :Smoke, pct = :r)
──────────────┬─────────────────────────
 Race / Smoke │      0       1    Total 
──────────────┼─────────────────────────
            1 │     44      52       96
              │ 45.833  54.167  100.000
──────────────┼─────────────────────────
            2 │     16      10       26
              │ 61.538  38.462  100.000
──────────────┼─────────────────────────
            3 │     55      12       67
              │ 82.090  17.910  100.000
──────────────┼─────────────────────────
        Total │    115      74      189
              │ 60.847  39.153  100.000
──────────────┴─────────────────────────
Pearson chi-square = 21.7790 (2), p < 0.0001

2.e Frequency table with column percentages alone

julia> tab(lbw, :Race, :Smoke, pct = :c)
──────────────┬───────────────────────────
 Race / Smoke │       0        1    Total 
──────────────┼───────────────────────────
            1 │      44       52       96
              │  38.261   70.270   50.794
──────────────┼───────────────────────────
            2 │      16       10       26
              │  13.913   13.514   13.757
──────────────┼───────────────────────────
            3 │      55       12       67
              │  47.826   16.216   35.450
──────────────┼───────────────────────────
        Total │     115       74      189
              │ 100.000  100.000  100.000
──────────────┴───────────────────────────
Pearson chi-square = 21.7790 (2), p < 0.0001

2.f Frequency table with cell percentages alone

julia> tab(lbw, :Race, :Smoke, pct = :e)
──────────────┬─────────────────────────
 Race / Smoke │      0       1    Total 
──────────────┼─────────────────────────
            1 │     44      52       96
              │ 23.280  27.513   50.794
──────────────┼─────────────────────────
            2 │     16      10       26
              │  8.466   5.291   13.757
──────────────┼─────────────────────────
            3 │     55      12       67
              │ 29.101   6.349   35.450
──────────────┼─────────────────────────
        Total │    115      74      189
              │ 60.847  39.153  100.000
──────────────┴─────────────────────────
Pearson chi-square = 21.7790 (2), p < 0.0001

2.g Frequency table with row and column percentages

julia> tab(lbw, :Race, :Smoke, pct = :rc)
──────────────┬───────────────────────────
 Race / Smoke │       0        1    Total 
──────────────┼───────────────────────────
            1 │      44       52       96
              │  45.833   54.167  100.000
              │  38.261   70.270   50.794
──────────────┼───────────────────────────
            2 │      16       10       26
              │  61.538   38.462  100.000
              │  13.913   13.514   13.757
──────────────┼───────────────────────────
            3 │      55       12       67
              │  82.090   17.910  100.000
              │  47.826   16.216   35.450
──────────────┼───────────────────────────
        Total │     115       74      189
              │  60.847   39.153  100.000
              │ 100.000  100.000  100.000
──────────────┴───────────────────────────
Pearson chi-square = 21.7790 (2), p < 0.0001

2.h Frequency table with summary values (mean, standard deviation, and count) of another variable

julia> tab(lbw, :Race, :Smoke, summarize = :BWt)
──────────────┬──────────────────────────────
 Race / Smoke │        0         1     Total 
──────────────┼──────────────────────────────
            1 │ 3428.750  2827.385  3103.010
              │  710.099   626.684   727.872
              │       44        52        96
──────────────┼──────────────────────────────
            2 │ 2854.500  2504.000  2719.692
              │  621.254   637.057   638.684
              │       16        10        26
──────────────┼──────────────────────────────
            3 │ 2814.236  2757.167  2804.015
              │  708.261   810.045   721.301
              │       55        12        67
──────────────┼──────────────────────────────
        Total │ 3054.957  2772.297  2944.286
              │  752.409   659.807   729.016
              │      115        74       189
──────────────┴──────────────────────────────

3. Three-way frequency table

3.a Frequency table

julia> tab(lbw, :Race, :Smoke, :UI)


UI = 0

──────────────┬───────────────────────────
 Race / Smoke │       0        1    Total 
──────────────┼───────────────────────────
            1 │      40       43       83
              │  48.193   51.807  100.000
              │  40.000   70.492   51.553
              │  24.845   26.708   51.553
──────────────┼───────────────────────────
            2 │      13       10       23
              │  56.522   43.478  100.000
              │  13.000   16.393   14.286
              │   8.075    6.211   14.286
──────────────┼───────────────────────────
            3 │      47        8       55
              │  85.455   14.545  100.000
              │  47.000   13.115   34.161
              │  29.193    4.969   34.161
──────────────┼───────────────────────────
        Total │     100       61      161
              │  62.112   37.888  100.000
              │ 100.000  100.000  100.000
              │  62.112   37.888  100.000
──────────────┴───────────────────────────
Pearson chi-square = 19.8732 (2), p < 0.0001


UI = 1

──────────────┬───────────────────────────
 Race / Smoke │       0        1    Total 
──────────────┼───────────────────────────
            1 │       4        9       13
              │  30.769   69.231  100.000
              │  26.667   69.231   46.429
              │  14.286   32.143   46.429
──────────────┼───────────────────────────
            2 │       3        0        3
              │ 100.000    0.000  100.000
              │  20.000    0.000   10.714
              │  10.714    0.000   10.714
──────────────┼───────────────────────────
            3 │       8        4       12
              │  66.667   33.333  100.000
              │  53.333   30.769   42.857
              │  28.571   14.286   42.857
──────────────┼───────────────────────────
        Total │      15       13       28
              │  53.571   46.429  100.000
              │ 100.000  100.000  100.000
              │  53.571   46.429  100.000
──────────────┴───────────────────────────
Pearson chi-square = 6.1449 (2), p = 0.0463075
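
The two tests above are computed separately within each level of UI. As a hedged sketch of that stratified calculation, the code below recomputes both statistics in base Julia from the counts printed in the tables. The helper `pearson_chisq` is hypothetical (not a FreqTools function), and the p-value uses the closed form exp(-x / 2), which is the chi-square upper tail only for 2 degrees of freedom.

```julia
# Hypothetical helper for this sketch; not part of FreqTools.
function pearson_chisq(counts)
    # Expected counts under independence: (row totals × column totals) / grand total.
    expected = sum(counts, dims = 2) * sum(counts, dims = 1) ./ sum(counts)
    return sum((counts .- expected) .^ 2 ./ expected)
end

# Race × Smoke counts within each level of UI, copied from the tables above.
strata = Dict(0 => [40 43; 13 10; 47 8],
              1 => [4 9; 3 0; 8 4])

results = Dict{Int,Tuple{Float64,Float64}}()
for ui in sort(collect(keys(strata)))
    chisq = pearson_chisq(strata[ui])
    p = exp(-chisq / 2)   # valid only because df = (3 - 1) * (2 - 1) = 2
    results[ui] = (chisq, p)
    println("UI = ", ui, ": chi-square = ", round(chisq, digits = 4),
            ", p = ", round(p, sigdigits = 6))
end
```

A general implementation would need a chi-square distribution function for other degrees of freedom; the shortcut here is used only to keep the sketch dependency-free.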

3.b Summarize birthweight (BWt) by three variables

julia> tab(lbw, :Race, :Smoke, :UI, summarize = :BWt)


UI = 0

──────────────┬──────────────────────────────
 Race / Smoke │        0         1     Total 
──────────────┼──────────────────────────────
            1 │ 3494.500  2874.163  3173.120
              │  622.518   634.652   698.474
              │       40        43        83
──────────────┼──────────────────────────────
            2 │ 3002.615  2504.000  2785.826
              │  583.831   637.057   644.843
              │       13        10        23
──────────────┼──────────────────────────────
            3 │ 2884.617  3104.750  2916.636
              │  700.495   407.027   667.539
              │       47         8        55
──────────────┼──────────────────────────────
        Total │ 3143.910  2843.721  3030.174
              │  711.463   625.409   693.696
              │      100        61       161
──────────────┴──────────────────────────────


UI = 1

──────────────┬──────────────────────────────
 Race / Smoke │        0         1     Total 
──────────────┼──────────────────────────────
            1 │ 2771.250  2603.889  2655.385
              │ 1247.209   566.666   780.654
              │        4         9        13
──────────────┼──────────────────────────────
            2 │ 2212.667         .  2212.667
              │  298.329         .   298.329
              │        3         0         3
──────────────┼──────────────────────────────
            3 │ 2400.750  2062.000  2287.833
              │  645.396  1026.103   761.602
              │        8         4        12
──────────────┼──────────────────────────────
        Total │ 2461.933  2437.154  2450.429
              │  772.723   738.281   742.977
              │       15        13        28
──────────────┴──────────────────────────────
