Quantification in Logic Tensor Networks
Base quantification
The logical quantifiers, namely \(\forall\) and \(\exists\), are implemented in LTN using fuzzy aggregators. An example of a fuzzy aggregator is the mean. The ltn.fuzzy_ops module contains the implementation of the most common fuzzy aggregators.
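For instance, the universal and existential quantifiers can be built by wrapping a fuzzy aggregator in an ltn.Quantifier. The following is a minimal sketch; the aggregator class names (AggregPMeanError, AggregPMean) are taken from the LTNtorch examples and should be double-checked against ltn.fuzzy_ops:

    import ltn

    # A quantifier wraps a fuzzy aggregator from ltn.fuzzy_ops.
    # quantifier="f" builds a universal quantifier, quantifier="e" an existential one.
    Forall = ltn.Quantifier(ltn.fuzzy_ops.AggregPMeanError(p=2), quantifier="f")
    Exists = ltn.Quantifier(ltn.fuzzy_ops.AggregPMean(p=2), quantifier="e")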
In the note regarding the LTN broadcasting, we have seen that the output of logical predicates, for example \(P(x, y)\), is organized in a tensor whose dimensions are related to the free variables appearing in the predicate. In the case of the atomic formula \(P(x, y)\), the result is a tensor \(\mathbf{out} \in [0., 1.]^{m \times n}\), where \(m\) is the number of individuals of variable \(x\), while \(n\) is the number of individuals of variable \(y\).
Notice that:
the tensor \(\mathbf{out}\) has two dimensions because the atomic formula \(P(x, y)\) has two free variables, namely \(x\) and \(y\);
variable \(x\) has been placed on the first dimension and variable \(y\) on the second, according to their order of appearance in the atomic formula;
the output of the predicate is wrapped inside an LTN object whose value attribute contains the tensor \(\mathbf{out}\), while the free_vars attribute contains the list ['x', 'y'] (this behaviour is sketched below).
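The broadcasting behaviour can be mimicked in plain PyTorch with the following minimal sketch (the predicate P below is a hypothetical toy function, not part of LTNtorch):

    import torch

    m, n = 3, 4                                # number of individuals of x and y
    x = torch.randn(m, 2)                      # m individuals of x, each in R^2
    y = torch.randn(n, 2)                      # n individuals of y, each in R^2

    def P(a, b):                               # toy predicate: truth degrees in [0, 1]
        return torch.sigmoid(-torch.norm(a - b, dim=-1))

    # LTN broadcasting: evaluate P on all m * n combinations of individuals.
    out = P(x.unsqueeze(1), y.unsqueeze(0))    # shape: (m, n)

In LTNtorch, this tensor would be wrapped in an LTN object whose free_vars attribute is ['x', 'y'].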
Now, we can extend this example with the formula \(\forall x P(x, y)\). In order to compute the result of this formula, LTNtorch must compute the result of the atomic formula \(P(x, y)\) first, and then apply a fuzzy aggregator to this result to perform the desired quantification on variable \(x\).
We know that the output of \(P(x, y)\) is the tensor \(\mathbf{out} \in [0., 1.]^{m \times n}\), and that the first dimension is related to variable \(x\) while the second is related to variable \(y\). In order to quantify \(P(x, y)\) on variable \(x\), LTNtorch simply aggregates the tensor \(\mathbf{out}\) over the first dimension. Let us assume that the selected fuzzy aggregator for \(\forall\) is the mean. LTNtorch then computes torch.mean(out, dim=0), where out is the tensor \(\mathbf{out}\) and dim=0 means that we aggregate over the first dimension of out, namely the dimension related to variable \(x\).
After the application of the quantifier, a new LTN object is created. The value attribute will contain the result of \(\forall x P(x, y)\), which is the tensor \(\mathbf{out}' \in [0., 1.]^n\), while the free_vars attribute will contain the list ['y'].
Notice that:
the result has one dimension less with respect to the result of \(P(x, y)\). This is because we have aggregated over one of the two dimensions, namely the dimension of \(x\). The only dimension left is the one related to \(y\), and it is for this reason that \(\mathbf{out}'\) is now a vector in \([0., 1.]^n\);
the free_vars attribute now contains only variable \(y\). This is because variable \(x\) is no longer free, since it has been quantified by \(\forall\).
Finally, notice that if the formula had been \(\forall x \forall y P(x, y)\), the quantification would have been torch.mean(out, dim=(0,1)), namely both dimensions of \(\mathbf{out}\) would have been aggregated. In that case, the output would have been a scalar in \([0., 1.]\), and the formula would have no free variables left, since both variables have been quantified.
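As a self-contained sketch of these two aggregations in plain PyTorch (with the mean as aggregator):

    import torch

    out = torch.rand(3, 4)                   # stand-in for the truth values of P(x, y), m=3, n=4

    # forall x P(x, y): aggregate over the dimension related to x.
    forall_x = torch.mean(out, dim=0)        # shape: (4,), free variable left: y

    # forall x forall y P(x, y): aggregate over both dimensions.
    forall_xy = torch.mean(out, dim=(0, 1))  # scalar in [0., 1.], no free variables left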
Diagonal quantification
Given 2 (or more) variables, there are scenarios where one wants to express statements about specific pairs (or tuples) of values only, instead of all the possible combinations of values of the variables. This is allowed in LTN thanks to the concept of diagonal quantification.
To make the concept clear, let us consider a simple example. Suppose that we have the formula \(\forall x \forall y P(x, y)\). Suppose also that variables \(x\) and \(y\) have the same number of individuals, and that this number is \(n\).
With the usual LTN broadcasting, the predicate \(P(x, y)\) is computed on all the possible combinations of individuals of \(x\) and \(y\). In other words, the result of \(P(x, y)\) would be a tensor \(\mathbf{out} \in [0., 1.]^{n \times n}\). Then, after the quantification on both \(x\) and \(y\), we obtain a scalar in \([0., 1.]\).
With diagonal quantification (\(\forall Diag(x, y) P(x, y)\)), instead, predicate \(P(x, y)\) is computed only on the tuples \((x_i, y_i)\), where \(x_i\) is the \(i\)-th individual of \(x\), while \(y_i\) is the \(i\)-th individual of \(y\). In other words, the output would be a tensor \(\mathbf{out} \in [0., 1.]^n\). Notice that the output has one single dimension since the predicate has been computed on \(n\) tuples only, namely the tuples obtained by constructing a one-to-one correspondence between the individuals of \(x\) and the individuals of \(y\). At the end, after the quantification, a scalar is obtained, as in the previous case.
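In plain PyTorch terms, the difference between the two computations can be sketched as follows (again with a hypothetical toy predicate):

    import torch

    n = 5
    x = torch.randn(n, 2)                         # n individuals of x
    y = torch.randn(n, 2)                         # n individuals of y

    def P(a, b):                                  # toy predicate: truth degrees in [0, 1]
        return torch.sigmoid(-torch.norm(a - b, dim=-1))

    # Usual LTN broadcasting: all n * n combinations of individuals.
    out_all = P(x.unsqueeze(1), y.unsqueeze(0))   # shape: (n, n)

    # Diagonal quantification: only the n aligned tuples (x_i, y_i).
    out_diag = P(x, y)                            # shape: (n,)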
The advantages of diagonal quantification are manifold:
it is a way to disable the LTN broadcasting. Although it has been designed to work with quantified statements, it can also serve as a way to temporarily disable the LTN broadcasting when computing some formulas. In fact, sometimes it is not necessary to compute a predicate on all the possible combinations of individuals of the input variables;
it is more efficient than the usual quantification, since it avoids computing a predicate on all the possible combinations of individuals of the variables appearing in the predicate;
it is useful when dealing with variables which represent machine learning examples. In many tasks, the dataset comes with labelled examples. One variable could contain the examples, while another variable could contain their labels. With diagonal quantification we can force LTNtorch to use these variables in a one-to-one correspondence. This avoids computing formulas on combinations of individuals which do not make sense, for example a data sample paired with the wrong label.
Notice that diagonal quantification expects the variables to have the same number of individuals, since a one-to-one correspondence has to be created.
In order to use diagonal quantification in LTNtorch, it is possible to use ltn.core.diag().
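A minimal usage sketch follows; it assumes the ltn.Variable, ltn.Predicate, ltn.Quantifier, and ltn.diag interfaces as they appear in the LTNtorch examples, so the exact signatures should be checked against the API reference:

    import torch
    import ltn

    x = ltn.Variable("x", torch.randn(5, 2))      # 5 individuals of x
    y = ltn.Variable("y", torch.randn(5, 2))      # 5 individuals of y (same number as x)

    # Toy predicate, for illustration only.
    P = ltn.Predicate(func=lambda a, b: torch.sigmoid(-torch.norm(a - b, dim=1)))
    Forall = ltn.Quantifier(ltn.fuzzy_ops.AggregPMeanError(p=2), quantifier="f")

    # forall Diag(x, y) P(x, y): P is evaluated only on the aligned pairs (x_i, y_i).
    out = Forall(ltn.diag(x, y), P(x, y))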
Guarded quantification
In some cases, it could be useful to quantify formulas only on the variables’ individuals which satisfy a given condition. This is allowed in LTN by using guarded quantification.
To make the concept clear, let us consider a simple example. Suppose that we have a binary predicate \(P(a, b)\). Then, we have two variables, \(x\) and \(y\), containing sequences of points in \(\mathbb{R}^2\). Specifically, \(\mathcal{G}(x) \in \mathbb{R}^{m_1 \times 2}\) and \(\mathcal{G}(y) \in \mathbb{R}^{m_2 \times 2}\). So, \(x\) contains \(m_1\) points, while \(y\) contains \(m_2\) points.
We have already seen that LTN allows computing the formula \(\forall x \forall y P(x, y)\). Also, we know that LTNtorch first computes \(P(x, y)\) and then aggregates over the dimensions specified by the quantified variables.
Suppose now that we want to compute the same formula \(\forall x \forall y P(x, y)\), but quantifying only over the pairs of points whose distance is lower than a certain threshold, represented by the constant \(th\). In Real Logic, it is possible to formalize this statement as \(\forall x \forall y : (dist(x, y) < th) \text{ } P(x, y)\), where \(dist\) is a function computing the Euclidean distance between two points in \(\mathbb{R}^2\).
In order to compute this formula, LTNtorch follows this procedure (sketched in plain PyTorch after the list):
it computes the result of the atomic formula \(P(x, y)\), which is a tensor \(\mathbf{out} \in [0., 1.]^{m_1 \times m_2}\);
it masks the truth values in the tensor \(\mathbf{out}\) which do not satisfy the given condition (\(dist(x, y) < th\));
it aggregates the tensor \(\mathbf{out}\) over both dimensions, since both variables have been quantified. In this aggregation, the truth values masked in the previous step are not considered. The result is a scalar in \([0., 1.]\).
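The three steps can be sketched in plain PyTorch as follows (mean aggregator, hypothetical toy predicate):

    import torch

    m1, m2, th = 4, 6, 1.0
    x = torch.randn(m1, 2)                        # m1 points in R^2
    y = torch.randn(m2, 2)                        # m2 points in R^2

    def P(a, b):                                  # toy predicate: truth degrees in [0, 1]
        return torch.sigmoid(-torch.norm(a - b, dim=-1))

    # Step 1: compute P(x, y) on all combinations of individuals.
    out = P(x.unsqueeze(1), y.unsqueeze(0))       # shape: (m1, m2)

    # Step 2: build the mask of pairs satisfying dist(x, y) < th.
    mask = torch.cdist(x, y) < th                 # shape: (m1, m2), boolean

    # Step 3: aggregate only the unmasked truth values over both dimensions.
    # (If no pair satisfies the condition, out[mask] is empty and the mean is NaN.)
    result = out[mask].mean()                     # scalar in [0., 1.]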
For applying guarded quantification in LTNtorch, see ltn.core.Quantifier. In particular, see the mask_fn and cond_vars parameters.
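As a rough, purely hypothetical usage sketch (the parameter names below are only the ones mentioned in this note; the actual signature of ltn.core.Quantifier may differ and should be checked in the API reference; Forall, P, x, y, and th are reused from the sketches above):

    # Hypothetical sketch: the parameter names (cond_vars, mask_fn) follow this
    # note and may not match the actual ltn.core.Quantifier call signature.
    out = Forall(
        [x, y],                                       # quantified variables
        P(x, y),                                      # formula to aggregate
        cond_vars=[x, y],                             # variables the guard depends on
        mask_fn=lambda a, b: torch.cdist(a, b) < th,  # guard: dist(x, y) < th
    )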