EXCEL VBAで2つの1次元配列が等しいか調べ,配列の差分を求める

 EXCEL VBA において,2 つの配列が等しいか否か調べたいことはよくあります.更に2つの配列の差分,つまり互いに重複しない要素を取り出したいという需要もあります.今回は配列の差分を求めるコードを紹介します.

 2つの配列が等しいかをまず調べ,等しくない場合に次に一方が他方の真部分集合であるか否かを調べます.最後に配列の差分を求めます.この順になっているのは全く同じ2つの配列の差分は空集合となって解が求まらないためであり,引き算される配列が引き算する配列の真部分集合の場合にも結果が空集合となり解が求まらないためです.

Array1&Array2

 Array1 の配列要素全体を A, Array2 の配列要素全体を B とすると Array1 から Array2 を差し引いた差分は下式で表現されます.

Array1 - Array2 = A\ AND\ (not B) = A \cap \neg B

 二重ループによりそれぞれの要素を比較して一致した数を求めていますが,これは積集合を取っていることに他なりません.EXCELワークシートにおいて数式の参照元の最初のセルおよび参照先の最後のセルを取得するでも述べましたが,2つの集合の和集合と積集合の要素数が等しければ元の2つの集合は全く同じです.ここでのロジックは積集合の要素数と元の集合の要素数が等しければ元の2つの集合は等しい筈であるとの前提に立っています.

Option Explicit

Function COMPARE_ARRAY(ByRef Array1() As String, ByRef Array2() As String) As Boolean
    Dim i   As Long
    Dim j   As Long
    Dim k   As Long
    k = 0
    For i = LBound(Array1) To UBound(Array1)
        For j = LBound(Array2) To UBound(Array2)
            If Array1(i) = Array2(j) Then
                k = k + 1
            End If
        Next j
    Next i
    If i = j And i = k Then
        COMPARE_ARRAY = True
    Else
        COMPARE_ARRAY = False
    End If
End Function

Function PROPERSUBSET_ARRAY(ByRef Array1() As String, ByRef Array2() As String) As Boolean
    Dim i   As Long
    Dim j   As Long
    Dim k   As Long
    k = 0
    For i = LBound(Array1) To UBound(Array1)
        For j = LBound(Array2) To UBound(Array2)
            If Array1(i) = Array2(j) Then
                k = k + 1
            End If
        Next j
    Next i
    If k = i Then
        PROPERSUBSET_ARRAY = True
    Else
        PROPERSUBSET_ARRAY = False
    End If
End Function

 下記関数では第 3 引数の Compare の指定により Array1 から Array2 の差分を求めるか,Array2 から Array1 の差分を求めるかを指定しています.但し,配列内の要素にもともと空白文字列があった場合にはうまく動作しません.

Function EXCEPT_ARRAY(ByRef Array1() As String, ByRef Array2() As String, ByVal Compare As Boolean) As String()
    Dim i               As Long
    Dim j               As Long
    Dim k               As Long
    Dim InternalAr1()   As String
    Dim InternalAr2()   As String
    Dim Ar1Ar()         As String
    Dim Ar2Ar()         As String
    InternalAr1 = Array1
    InternalAr2 = Array2
    For i = LBound(InternalAr1) To UBound(InternalAr1)
        For j = LBound(InternalAr2) To UBound(InternalAr2)
            If InternalAr1(i) = InternalAr2(j) Then
                InternalAr1(i) = ""
                InternalAr2(j) = ""
                k = k + 1
            End If
        Next j
    Next i
    k = 0
    For i = LBound(InternalAr1) To UBound(InternalAr1)
        If InternalAr1(i) = "" Then
        Else
            ReDim Preserve Ar1Ar(k)
            Ar1Ar(k) = InternalAr1(i)
            k = k + 1
        End If
    Next i
    k = 0
    For j = LBound(InternalAr2) To UBound(InternalAr2)
        If InternalAr2(j) = "" Then
        Else
            ReDim Preserve Ar2Ar(k)
            Ar2Ar(k) = InternalAr2(j)
            k = k + 1
        End If
    Next j
    If Compare Then
        EXCEPT_ARRAY = Ar1Ar
    Else
        EXCEPT_ARRAY = Ar2Ar
    End If
End Function

参照:
EXCELワークシートにおいて数式の参照元の最初のセルおよび参照先の最後のセルを取得する

How to compare and get differences between 2 arrays in EXCEL VBA?

When you’d like to compare 2 arrays in EXCEL VBA whether they are equal or not, what would you do? Furthermore, you might get different elements that don’t overlap each other. I’d like to describe the code that how to get differences of 2 arrays.

At first, you would have to check whether they are equal each other or not because the difference of equal arrays is empty set. Next, you would have to check whether an array is proper subset of another array or not. At last, you could get difference between 2 arrays.

Array1&Array2

See formula as below, it is shown all elements of Array1 as “A”, all elements of Array2 as “B” and differences elements from Array1 and Array2 as “A And (not B)”.

Array1 - Array2 = A\ AND\ (not B) = A \cap \neg B

The following code compares elements between two set and get number of equal elements with double loop, that means to get intersection of 2 sets.

Option Explicit

Function COMPARE_ARRAY(ByRef Array1() As String, ByRef Array2() As String) As Boolean
    Dim i   As Long
    Dim j   As Long
    Dim k   As Long
    k = 0
    For i = LBound(Array1) To UBound(Array1)
        For j = LBound(Array2) To UBound(Array2)
            If Array1(i) = Array2(j) Then
                k = k + 1
            End If
        Next j
    Next i
    If i = j And i = k Then
        COMPARE_ARRAY = True
    Else
        COMPARE_ARRAY = False
    End If
End Function

Function PROPERSUBSET_ARRAY(ByRef Array1() As String, ByRef Array2() As String) As Boolean
    Dim i   As Long
    Dim j   As Long
    Dim k   As Long
    k = 0
    For i = LBound(Array1) To UBound(Array1)
        For j = LBound(Array2) To UBound(Array2)
            If Array1(i) = Array2(j) Then
                k = k + 1
            End If
        Next j
    Next i
    If k = i Then
        PROPERSUBSET_ARRAY = True
    Else
        PROPERSUBSET_ARRAY = False
    End If
End Function

Configuration of the 3rd argument “Compare” specifies which array should be excepted, when it was TRUE the function would except the latter from the former, when it was FALSE then it would except the former from the latter, respectively. It’s assumed that the elements of arrays has no empty string.

Function EXCEPT_ARRAY(ByRef Array1() As String, ByRef Array2() As String, ByVal Compare As Boolean) As String()
    Dim i               As Long
    Dim j               As Long
    Dim k               As Long
    Dim InternalAr1()   As String
    Dim InternalAr2()   As String
    Dim Ar1Ar()         As String
    Dim Ar2Ar()         As String
    InternalAr1 = Array1
    InternalAr2 = Array2
    For i = LBound(InternalAr1) To UBound(InternalAr1)
        For j = LBound(InternalAr2) To UBound(InternalAr2)
            If InternalAr1(i) = InternalAr2(j) Then
                InternalAr1(i) = ""
                InternalAr2(j) = ""
                k = k + 1
            End If
        Next j
    Next i
    k = 0
    For i = LBound(InternalAr1) To UBound(InternalAr1)
        If InternalAr1(i) = "" Then
        Else
            ReDim Preserve Ar1Ar(k)
            Ar1Ar(k) = InternalAr1(i)
            k = k + 1
        End If
    Next i
    k = 0
    For j = LBound(InternalAr2) To UBound(InternalAr2)
        If InternalAr2(j) = "" Then
        Else
            ReDim Preserve Ar2Ar(k)
            Ar2Ar(k) = InternalAr2(j)
            k = k + 1
        End If
    Next j
    If Compare Then
        EXCEPT_ARRAY = Ar1Ar
    Else
        EXCEPT_ARRAY = Ar2Ar
    End If
End Function

REFERENCE:
How to get first cell which is referred to formula and last cell which refers to formula in Excel worksheet?

複数のExcelブックの全シートのフィルターを解除する

 オートフィルターの設定されているシートを手作業でフィルター解除するのは手間がかかります.VBA から GetOpenFilename メソッドで複数のブックを指定し,すべてのシートからオートフィルターを解除する方法です.

Option Explicit

Sub RemoveAutoFilter()
    Dim Wb      As Workbook
    Dim Sh      As Worksheet
    Dim myPath  As Variant
    Dim i       As Long
    Application.ScreenUpdating = False
    myPath = Application.GetOpenFilename(FileFilter:="Microsoft Excelブック,*.xls?", MultiSelect:=True)
    If TypeName(myPath) = "Boolean" Then Exit Sub
    For i = LBound(myPath) To UBound(myPath)
        Set Wb = Workbooks.Open(myPath(i))
        For Each Sh In Wb.Worksheets
            If Sh.AutoFilterMode Then
                Sh.AutoFilterMode = False
            End If
        Next Sh
        Wb.Save
        Wb.Close
    Next i
    Application.ScreenUpdating = True
End Sub

参照:
Application.GetOpenFilename メソッド (Excel)

How to remove AutoFilter of all worksheets in multiple Excel workbooks?

It’s hard to remove manually AutoFilter of worksheets in multiple workbooks. I’d like to describe how to remove it with VBA.

Option Explicit

Sub RemoveAutoFilter()
    Dim Wb      As Workbook
    Dim Sh      As Worksheet
    Dim myPath  As Variant
    Dim i       As Long
    Application.ScreenUpdating = False
    myPath = Application.GetOpenFilename(FileFilter:="Microsoft Excel Book,*.xls?", MultiSelect:=True)
    If TypeName(myPath) = "Boolean" Then Exit Sub
    For i = LBound(myPath) To UBound(myPath)
        Set Wb = Workbooks.Open(myPath(i))
        For Each Sh In Wb.Worksheets
            If Sh.AutoFilterMode Then
                Sh.AutoFilterMode = False
            End If
        Next Sh
        Wb.Save
        Wb.Close
    Next i
    Application.ScreenUpdating = True
End Sub

Reference:
Application.GetOpenFilename Method (Excel)

Bayesian information criterion

If sample size (n) was so large enough, Bayesian information criterion (BIC), an evaluation criteria of model estimated by maximum likelihood method, would be approximated with Laplace method by integrating marginal likelihood corresponding to posterior probability of model. θ is a parameter of p-dimension and f(xn|θ) is probability distribution function, respectively.

\displaystyle BIC = -2\log f(x_n|\hat\theta) + p\log n

References:
Probability density function, expected value and variance of each probability distribution
How to calculate Akaike information criterion with probability distribution function?

ベイズ型情報量基準

 ベイズ型情報量基準 (BIC) は最尤法によって推定されたモデルの評価基準であり,サンプルサイズ n が十分に大きい時にモデルの事後確率に対応する周辺尤度を積分のラプラス法で近似して得られます.θ は p 次元パラメータ,f(xn|θ) は確率分布関数です.赤池情報量基準との違いは罰則項の係数が AIC では 2 に固定してあったのに対し, BIC ではサンプルサイズ n の自然対数を乗じているところにあります.

\displaystyle BIC = -2\log f(x_n|\hat\theta) + p\log n

参照:
確率分布ごとの確率密度関数および期待値と分散
赤池情報量基準(AIC)を確率分布関数から最尤法を用いて計算する

How to calculate Akaike information criterion with probability distribution function?

Akaike information criterion (ACI) is the most useful indicator to select variables in multivariate analysis. It’s assumed that N is free parameter number, ACI is calculated as below;

\displaystyle AIC = -2(Maximum\ Log\ Likelihood)+2N

Free parameter number of model is dimension of the space that parameter value could take in expected models. AIC is an evaluation criterion when expected model is estimated with maximum likelihood method and it indicates that log likelihood bias approximates to free parameter number included in model.

How to find maximum log likelihood? Let’s define log likelihood function as following equation;

\displaystyle l(\theta) = \sum_{\alpha=1}^{n}\log f(x_{\alpha}|\theta)

\hat\theta, that is maximum likelihood estimator, maximizes l(θ) and this is called as maximum-likelihood method. l(\hat\theta) = \Sigma_{\alpha=1}^{n}\log f(x_\alpha |\hat\theta) is called as maximum log-likelihood.

If log likelihood function (l(θ)) could be differentiable, maximum likelihood estimator (\hat\theta) would be given by solving differentiated likelihood equation.

\displaystyle \frac{\partial l(\theta)}{\partial \theta} = 0

References:
Probability density function, expected value and variance of each probability distribution

赤池情報量基準(AIC)を確率分布関数から最尤法を用いて計算する

 多変量解析の際の変数選択の一つの指標として赤池情報量基準 (Akaike information criterion) があります.詳細は成書を参考にしていただきたいのですが,N を自由パラメータ数とすると下式で求まります.

\displaystyle AIC = -2(Maximum\ Log\ Likelihood)+2N

 モデルの自由パラメータ数とは,想定したモデルに含まれるパラメータの値が動く空間の次元のことです.AIC は想定したモデルを最尤法で推定した時の評価基準であり,対数尤度のバイアスが漸近的にモデルに含まれる自由パラメータ数となることを示しています.

 最大対数尤度はどう求めるのでしょうか.ここで下式のように対数尤度関数を定義します.f(x|θ) は確率分布関数であり,分布によって形が変化します.

\displaystyle l(\theta) = \sum_{\alpha=1}^{n}\log f(x_{\alpha}|\theta)

 この l(θ) を最大化する \hat\theta が最尤推定量であり,この方法を最尤法といいます.l(\hat\theta) = \Sigma_{\alpha=1}^{n}\log f(x_\alpha |\hat\theta) を最大対数尤度と呼びます.

 対数尤度関数 l(θ) が微分可能な場合,最尤推定量 \hat\theta は尤度方程式を微分した解が 0 となる θ を求めることで求まります.

\displaystyle \frac{\partial l(\theta)}{\partial \theta} = 0

参照:
確率分布ごとの確率密度関数および期待値と分散