POV-Matter and Machinic POV between Affects and Umwelten – Mitra Azar

Mario Klingemann, My Artificial Muse. June 13, 2017.

Reading note:

The text is a little longer than the guidelines allow. For reasons of density I couldn’t cut it any shorter than this. Please feel free to skip the last section (VI) on GANs, which is a case study that applies the theoretical framework developed in the previous sections.


POV-matter and machinic POV between affects and Umwelten

By Mitra Azar

“Could a machine think?

Could it be in pain?”[1]

L. Wittgenstein, Philosophical Grammar.

I. POV, affects and Umwelt between phenomenology, cinema and machine vision

Within the broader attempt to embark on a genealogy and an archeology of POV (Point of View), which aims at providing a framework for analyzing the various regimes of visibility and regimes of truth[2] that coalesce with different forms of POV and their corresponding Umwelten, I propose affect theory as the conceptual toolbox for drawing a cartography that goes from the formation of the first “centers of indetermination”[3] emerging from the pre-biotic soup after the inflation that follows the Big Bang[4], up to the latest developments in machine vision and AI. By doing so, I plan to engage with Bergson’s famous statement “there’s no perception without affection”[5] (1896), trying to understand its meaning against a phenomenological, cinematic and machinic notion of POV – as much as against a notion of Umwelt (Uexküll, 1957) described via the affordances generated by the interaction between organisms, space, and technology[6].

In particular, the reversibility of the couple seeing / seen[7] in Merleau-Ponty and the relation between eye and gaze[8] in Lacan are taken as references for understanding the transformation between a phenomenological, a cinematic, and a machinic notion of POV, and are – as I hope to illustrate – surprisingly re-invented by a form of machine vision produced by Deep Convolutional Generative Adversarial Networks, or DCGANs (for simplicity, GANs).

II. POV in cinema and phenomenology: reversibility between seeing / seen and split between eye / gaze

The expression POV is a technical acronym from the field of cinema[9]. It refers to a type of image that allows the viewer to see what the character sees from the character’s perspective (or orientation). Cinematic POV images simulate the movement of an actor within a space, creating a sense of continuity between viewers and what is viewed, as if viewers were ‘embodied’ in the images they are looking at. In this sense, cinematic POV images generate a seamless overlapping between the camera, the actor’s body and the spectator’s body, thus producing, for the first time in the history of technology, a seamless overlapping between the human and the technological. Furthermore, cinema articulates the relation between the spectator’s POV, understood as the phenomenological orientation produced by an embodied agent in a physical space defined as a regime of light, and the regime of visibility produced by the cinematic analog machine. The collapse and overlapping of the embodied agent’s regime of light and the regime of visibility generated by the cinematic analog machine is the main feature of the cinematic technique of POV.

From a phenomenological perspective, one of the main features of human POV is that of expressing a “worldly sensitivity”[10] visually characterized by the reversibility of the couple seeing / seen: I see the world around me but I am simultaneously seen by others, and this reversibility (together with the reversibility between touching / touched) is what defines my being-in-the-world[11], my embeddedness in an intersubjective world[12]. The potentially horizontal relation between seeing / seen is investigated in the context of Lacanian psychoanalysis in relation to the asymmetric relation between the eye and the gaze, according to which “I’m seeing only from one point [literally, a POV], but in my existence I’m looked at from all sides”[13].

Cinema does something interesting to these phenomenological categories. While it seems possible to say that cinema enforces the vertical relation between the eye and the gaze (the eye being the eye of the spectator and the gaze being the director’s “all-seeing”[14] eye), in the case of the cinematic technique of POV, eye and gaze collapse into each other. Thus, POV re-establishes the horizontal relation between seeing and seen (in this case between the seeing of the viewer and the seen of the actor on behalf of the seen of the director). Furthermore, cinematic POV produces a reversibility that generates immersion and embodiment beyond the surface of the screen (inward)[15].

These considerations will prove useful in the closing section of this paper, where I will argue that both the reversibility between seeing and seen and the split between eye and gaze can be approached as relevant categories for understanding machine vision and the functioning of GANs. Nevertheless, before looking specifically at how GANs re-invent the notion of POV, let’s have a closer look at the relationship between a phenomenological notion of POV and the concept of Umwelt described by bio-semiologist J. von Uexküll.

III. POV and Umwelten

In fact, POV is not something that comes with the human or with the technological. POV as the phenomenological production of an orientation has, instead – as mentioned at the beginning – a cosmological origin. In this sense, POV refers to the very fact that, from the formation of the first nuclei of protons and neutrons a few millionths of a second after the Big Bang to the first electrons starting to spin around these nuclei, thus forming the first atoms 380,000 years later[16], the fundamental building blocks of matter have organized themselves by producing orientations – technically referred to as spins[17]. Matter is always oriented, regardless of the organic / inorganic divide – and can indeed be generalized as POV-matter. Thinking of matter as POV-matter is not trivial, because it allows us to think of POV as a cohesive phenomenological factor providing a possible foundation for a new (post)phenomenology to come[18], which discards the humanist and subjectivist dérive of classic phenomenology[19] and provides a pivotal concept for re-articulating the agential relation between the organic and the inorganic within a concept of POV-matter that modulates in relation to affects and Umwelten.

In a certain sense, it is also possible to argue that the notion of Umwelt cannot be thought without the phenomenological notion of POV. Umwelt is always in POV because it refers to a subjective (but not necessarily human) experience, and because of that it remains epistemically inaccessible[20] from the outside, even though observable – up to a certain degree[21].

To briefly sum up: in relation to the formation of POV-matter in its atomic and molecular inorganic form, I refer to a regime of visibility understood simply as a generic regime of light; when POV-matter turns into organic structures, the regime of light turns into the visual field of a specific Umwelt. More precisely, POV-matter develops into organic forms that orient themselves in space according to different evolutionary survival criteria, producing the organisms’ unique Umwelten[22], of which the organisms’ regimes of light represent the visual counterpart (or regime of visibility).

Once POV-matter turns into the expressive function of complex living organisms such as human beings, POV-matter turns into a cultural product. Here, POV-matter is defined by the interaction between a regime of light attached to the Umwelt of the organism and a regime of light attached to analog machines first (painting, analog photography and analog cinema) and to digital and algorithmic machines later. It is especially in relation to these new regimes of visibility that work in the absence of light[23] – such as the ones generated by digital photography and digital cinema, by the Google gaze circuit (Google Maps, Google Car, Google 360, Google Glass) and finally by machine vision – that I propose to re-think the concepts of POV, Umwelt and affect. Thus, before approaching the POVs of machine vision and GANs against a phenomenological and cinematic notion of POV, let’s have a look at the relationship between POV-matter and affects.

IV. POV-matter and affects

POV-matter is defined as a form of orientation co-emerging with affects during the formation of the first proto-stable organic forms of life after the Big Bang and persisting – although drastically changing form – through its various sub-atomic, molecular, organic, human, cultural and machinic instantiations.

It is especially Bergson’s statement “there’s no perception without affection”[24] that offers a clue to the relation between POV and affect, as much as a blueprint for approaching machine vision, especially in the context of GANs.

The theme of affects emerges in Deleuze in relation to the attempt to categorize cinematic images by drawing on Peirce’s semiology and Bergson’s ontology[25]. From a universe of a-centered movement-images composed of images that “act and react on all their facets and in all their parts”[26], Deleuze unfolds the image-perception to name a type of image which “only receive[s] actions on one facet or in certain parts and only execute reactions by and in other parts”[27]: […] “the image reflected by a living image is precisely what will be called perception”[28]. The image-perception grounds its selective interaction with its surroundings by producing a gap between action and reaction, and by so doing an orientation. This orientation reflects the affordances appearing between the POV-matters thus emerging and their surrounding Umwelten. Affect fills the seemingly empty gap that defines the emergence of POV as the interval between action and reaction: “the interval is not merely defined by the specialization of the two limit-facets, perceptive and active. There’s an in-between. Affection is what occupies the interval, what occupies it without filling it in or filling it up”[29].

Following this understanding of affect, and reducing it to its essence, I’d like to argue that affection can be understood as the figure for the coincidence between subject and object – or better said, for the becoming-object of the subject for itself via self-perception. This is interesting for the argument I’m trying to build, because cinematic POV produces a similar result – the overlapping of the subject (audience / actor / director) with the object (camera / screen). POV and affect point towards the overlapping between the human and the machine, or between subject and object – producing a form of machinic embodiment[30]. Thus, POV turns into a tool for thinking about affects beyond the organic / inorganic divide, and vice versa, and can be finally (and tentatively) approached in the context of AI and especially GANs.

V. POV and machines

“Now objects perceive me”[31], Paul Klee stated poignantly in his diaries, as philosopher Paul Virilio notes in the opening of his The Vision Machine, somehow prophetically envisioning a world of objects that learn how to see – and to “sense” – the surrounding space and the bodies occupying it.

New technologies of vision oriented towards new forms of data-veillance[32], such as the Google gaze circuit and, even more poignantly, machine vision, seem to give a technological consistency to Klee’s intuition. Moreover, they seem to confirm the asymmetry Lacan locates at the very heart of our phenomenological intertwining[33] with the world, making visible the encompassing visual power of the (technological) gaze against the localized and punctual vision of the (human) eye. This asymmetry is currently taking new forms that extend the capability of the gaze towards “all-seeing”: bio-tracking technologies based on AI aim at quantifying a number of qualitative inputs that range, for example, from facial features and facial expressions to breathing patterns and heartbeats, with the goal of accessing the very gap between action and reaction that defines a center of indetermination – or its emergence as an affective, embodied, enworlded POV. The technological gaze thus tries to vicariously access the eye in the form of an affective body, at the very moment in which it emerges as a POV. In this sense, new technologies of vision based on AI – 21st-century media[34] in the wording of Mark Hansen – try to locate themselves at the very gap where a “worldly”[35] sensitivity is formed. By doing so – drawing from Hansen – these new media[36] grant humans the possibility of perceiving their very constitution as POV at a conscious level, beyond the so-called missing half-second that defines the minimum temporal gap between a perception (or sensation) and its neural registration[37].

Thus, these technologies invite us to re-think the notions of affect and Umwelt in relation to this new imbrication between the human and the technological. In cinematic POV, the overlapping between human and technology produces the overlapping between the regime of light of an embodied POV and the regime of visibility of the cinematic machine over the surface of the screen opening inward – thus allowing the human inside. With 21st-century media the process looks similar, yet inverted: these media produce the overlapping between the human and the machine by inserting the machine into the human, and not vice versa. To do so, they attempt to access human POV by accessing the very gap from which it emerges – first breaching through the screen of the body (inward phase) and then extracting worldly data from below the threshold of human consciousness. Thus, machines vicariously access a bodily dimension, while humans are exposed to a quantified version of their very enworlded affective fabric, which – datified – contributes to the constitution of new forms of human-machinic Umwelt with complex political implications. One of the most significant changes in relation to these new forms of Umwelt is that the affordances between the human and the surrounding space are technically prehended by a capture which re-defines affordances as such, and which claims to carry out their very design in ways that fulfil the subject’s expectations better than the subject’s own agential design could possibly do within the boundaries of its structural perceptive delay, the missing half-second[38].

VI. Towards a phenomenological understanding of GANs

GANs are a form of unsupervised machine learning able to access raw data from the world and to build an understanding of them without the mediation of the linguistic labeling performed by humans (or Mechanical Turk workers[39]) who tag huge datasets of images to prepare them for training supervised machine learning algorithms[40]. GANs build an understanding of raw data by establishing an antagonistic relation between two neural networks: one generates candidate data (generator), the other discriminates between the generated candidates and instances of the real data (discriminator). In a sense, generator and discriminator constitute each other through a visually based dialogic exchange that closely recalls both the intertwining of the couple seeing / seen and the split between the eye and the gaze. Generator and discriminator see each other and by doing so establish each other’s POV, while at the same time enacting the distinctive roles of the eye (generator) and of the gaze (discriminator). At the same time, what we might (hazardously) call a machinic form of Umwelt appears as the place of the GANs’ affordances. The GANs’ Umwelt functions as the intersection between the originally diverse Umwelten of generator and discriminator, and takes the form of a latent space[41]. The latent space can be addressed as a virtual screen where a differential recognition happens – a recognition based on the interplay between the different POVs of generator and discriminator – producing a form of machinic perception where the complexity of the intertwining between embodied POVs is reduced to a task-oriented, statistically induced capability for pattern recognition – a feature typical of AI in general, according to Matteo Pasquinelli[42]. The latent space works as the gap through which the actions and reactions of both generator and discriminator pass, and where the process of establishing consistency in the POVs of both generator and discriminator is in its (algorithmic) becoming.

Nevertheless, there’s no affect in this gap, which emerges with the intertwining between the POVs of generator and discriminator and between their respective originary Umwelten. As a consequence, following Bergson’s statement “there’s no perception without affection”, we can only metaphorically refer to perception – as much as to POVs and Umwelt. Affective computing and GANs push forward the question of the relation between POV, affects and Umwelt, and do so through algorithms that mimic forms of phenomenological enworlding – either by vicariously accessing the body or by reproducing the intertwining characteristic of embodied POV. Artist Mario Klingemann[43] develops a body of work based on accessing the GANs’ latent space, suspending the antagonistic process taking place there between the POVs of generator and discriminator and interjecting a human POV into this machinic micro-gap – somehow turning it into a potential affective space where new forms of human-machinic expressivity can emerge.
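
For readers less familiar with the technical side, the antagonistic exchange between generator and discriminator described in this section can be sketched in a few lines of Python. What follows is a deliberately reduced toy under stated assumptions, not a DCGAN and not the system used in the works discussed here: the "world" is a one-dimensional Gaussian, the generator is a single affine map of a latent sample z, the discriminator is a logistic regression, and all names and parameter values (generate, discriminate, train, real_mean, lr and so on) are hypothetical choices made for illustration. In this toy the latent space is simply the space of the generator's random input z.

```python
import math
import random


def sigmoid(t):
    """Numerically stable logistic function."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def generate(z, a, b):
    """Generator (toy assumption): maps a latent sample z to a data-space sample."""
    return a * z + b


def discriminate(x, w, c):
    """Discriminator (toy assumption): estimated probability that x is real data."""
    return sigmoid(w * x + c)


def train(steps=2000, batch=64, lr=0.05, real_mean=4.0, real_std=1.0, seed=0):
    """Alternate one discriminator update and one generator update per step."""
    rng = random.Random(seed)
    a, b = 1.0, 0.0  # generator parameters
    w, c = 0.0, 0.0  # discriminator parameters
    for _ in range(steps):
        real = [rng.gauss(real_mean, real_std) for _ in range(batch)]
        fake = [generate(rng.gauss(0.0, 1.0), a, b) for _ in range(batch)]

        # Discriminator step: gradient descent on
        # -[log D(real) + log(1 - D(fake))], averaged over the batch.
        grad_w = grad_c = 0.0
        for x_real, x_fake in zip(real, fake):
            s_real = discriminate(x_real, w, c)
            s_fake = discriminate(x_fake, w, c)
            grad_w += -(1.0 - s_real) * x_real + s_fake * x_fake
            grad_c += -(1.0 - s_real) + s_fake
        w -= lr * grad_w / batch
        c -= lr * grad_c / batch

        # Generator step: gradient descent on -log D(G(z)),
        # the commonly used "non-saturating" generator loss.
        grad_a = grad_b = 0.0
        for _ in range(batch):
            z = rng.gauss(0.0, 1.0)
            s_fake = discriminate(generate(z, a, b), w, c)
            d_x = -(1.0 - s_fake) * w  # derivative of the loss w.r.t. the fake sample
            grad_a += d_x * z
            grad_b += d_x
        a -= lr * grad_a / batch
        b -= lr * grad_b / batch
    return a, b, w, c


if __name__ == "__main__":
    a, b, w, c = train()
    print("generator offset b:", b, "(real mean assumed to be 4.0)")
```

Each network is updated only through the other's response: the discriminator is adjusted against the generator's current samples, and the generator is adjusted against the discriminator's current judgement. Because the toy discriminator is linear it can only register a difference in means, and adversarial training of this kind may oscillate rather than settle, so the sketch should be read as an illustration of the exchange and not as a demonstration of convergence.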


[1] Wittgenstein, L. Philosophical Grammar. Malden: Blackwell, 1974. Print.

[2] I’ve started investigating the relation between POV, regimes of truth and games of truth in a paper presented at the 2018 After post-truth conference in Barcelona. Azar, M. “From Panopticon to POV-opticon: drive to visibility and games of truth”. A draft version of the paper can be found on academia.edu.

[3] Bergson, H. Matter and memory, London: George Allen and Unwin, 1911 (1896). Print.

[4] The early universe. cern.com. Web.

[5] Bergson, H. Matter and memory, London: George Allen and Unwin, 1911 (1896). Print.

[6] The role of technologies in the construction of techno-phenomenological Umwelten is addressed in the case of human Umwelten.

[7] Cfr. Merleau-Ponty, M. Phénoménologie de la perception. Paris: Gallimard, 1945. Print.

[8] Cfr. Lacan, J. Les quatre concepts fondamentaux de la psychanalyse. Paris: Le Seuil, 1973. Print.

[9] For a detailed explanation of POV in cinema and gaming, cfr. Galloway, A. “Origins of the First-Person Shooter”. In T. Corrigan, & P. White (Eds.), Critical Visions in Film Theory. New York: St. Martin’s Press, 2011. Print.

[10] Cfr. Hansen, M. Feed-forward. Chicago: University of Chicago Press, 2015, pg. 266. Print.

[11] Cfr. Heidegger, M. Being and Time. Oxford: Blackwell, (1927) 1962. Print.

[12] Cfr. Merleau-Ponty, M. Phénoménologie de la perception. Paris: Gallimard, 1945. Print.

[13] Lacan, J. Les quatre concepts fondamentaux de la psychanalyse. Paris: Le Seuil, 1973, pg. 74. Print.

[14] Ibidem, pg.74.

[15] The reversibility between seeing and seen in cinema can also take a different form, and this happens when the actor addresses directly the spectator by looking at the camera and thus breaks the surface of the screen outward, generating estrangement rather than embodiment.

[16] The early universe. cern.com. Web.

[17] Spin is a term from physics that refers to the intrinsic angular momentum of a particle, observable for instance in the deflection of the particle as it passes through a magnetic field. Cfr. Cern.com. Web.

[18] This attempt constitutes the proper philosophical contribution of the on-going research I’m developing as a Ph.D. candidate at Aarhus university, and it is mainly grounded on the work of Mark Hansen, Bernard Stiegler and Brian Massumi. In fact, the main argument of the research consists in arguing that a notion of POV is much needed in Hansen’s attempt to rethink phenomenology in relation to 21 century media, as much as in both Stiegler’s and Massumi’s philosophies of techno-time and affects.

[19] Cfr. Ricoeur, P. Husserl. An analysis of his phenomenology. Evanston: Northwestern University Press, (1967), 2007. Print.

[20] Emmeche, C. “Can robots have an Umwelt”. Semiotica vol. 134 (issue 1/4): pp. 653-693; 2001.

[21] Cfr. Uexküll, Jakob von (1940). Bedeutungslehre. (Bios 10. Johann Ambrosius Barth, Leipzig), [translated by Thure von Uexküll, 1982: The theory of meaning. Semiotica 42(1): 25-82; glossary p. 83-87 by T.v.U.].

[22] Cfr. Uexküll, J. von. “A Stroll Through the Worlds of Animals and Men: A Picture Book of Invisible Worlds”. In Schiller Claire H. Instinctive Behavior: The Development of a Modern Concept. New York: International Universities Press, 1957. Print.

[23] Cfr. Virilio, P. The vision machine. Bloomington: Indiana University Press, 1994. Print.

[24] Ibidem.

[25] Deleuze, G. Cinéma 1. L’image-mouvement. Paris: Minuit, 1983. Print.

[26] Ibid., pg.61.

[27] Ibidem, pg.62.

[28] Ibidem.

[29] Ibid., pg.65.

[30] In my research I also argue that POV can be approached as the inner genetic element of cinema, while in this new framework the history of cinema can be seen as the very attempt to erase the presence of POV (via tripods, cranes, dollies and more) as the phenomeno-aesthetic figure at the foundation of cinema itself.

[31] Virilio, P. The Vision Machine. Bloomington: Indiana University Press, 1994. Print.

[32] Clarke, R. “Information Technology and Dataveillance”. Commun. ACM 31,5 1988: pp. 498-512. Web.

Yar, M. “Panoptic Power and the Pathologisation of Vision: Critical Reflections on the Foucauldian Thesis”. Surveillance & Society 1(3) 2003: 254-271. Web.

[33] Cfr. Merleau-Ponty, M. Phénoménologie de la perception. Paris: Gallimard, 1945. Print.

[34] Cfr. Hansen, M. B. N. Feed-forward. Chicago: University of Chicago Press, 2015, pg. 266. Print.

[35] Ibidem.

[36] Cfr. Hansen, M. B. N. New philosophy for new media. Boston: MIT, 2004. Print.

[37] Cfr. Massumi, B. The autonomy of affect. Cultural Critique, No. 31, The Politics of Systems and Environments, Part II. (Autumn, 1995), pp. 83-109. Print.

[38] This is what happens in relation to the creation of POV data-doubles and the consequent formation of filter bubbles based on the prehension of users’ affects and desires. I’ve started to work on this topic, specifically in relation to the emergence of a new type of selfie aesthetic, in a paper recently published in APRJA. Cfr. Azar, M. “The Algorithmic Facial Image (AFI) and the relation between truth value and money value”. APRJA, June 2018. Web. Another example of these forms of prehension is a new MIT prototype that allows users to control basic functions of a computer through an ergonomic wearable interface able to record the micro-movements of the subject’s lower jaw as a way to infer brain activity – the jaw moves slightly when the brain formulates a decision, even without the production of a verbal utterance – and certainly before the subject becomes aware of it. “Electrodes on the face and jaw pick up otherwise undetectable neuromuscular signals triggered by internal verbalizations”. Hardesty, L. MIT News Office, “Computer system transcribes words users ‘speak silently’”. News.mit.edu. Web.

[39] Cfr. Mechanical Turks. Wikipedia. Web.

[40] Cfr. Supervised Learning, Wikipedia. Web.

[41] Latent space is technically defined as the space where a “generative network learn to map […] a particular data distribution of interest, while the discriminative network discriminates between instances from the true data distribution and candidates produced by the generator”. Cfr. Generative Adversarial Network. Wikipedia. Web.

[42] Pasquinelli, M. “Machines that Morph Logic: Neural Networks and the Distorted Automation of Intelligence as Statistical Inference”. Glass Bead, 1: “Logic Gate: The Politics of the Artifactual Mind”, 2017. Web: http://www.glass-bead.org/article/960.

[43] http://quasimondo.com/.


  1. Hi Mitra, thanks so much for your piece! I’m looking forward to speaking with you about adversarial neural networks at the workshop in a few weeks, as I’ve done a bit of writing on this too.

    My question is about the relationship between machinic perception and POV. POV is an embodied way of sensing, where, as you say, the character’s perspective overlaps with that of the camera and the spectator. So POV is about vectors of vision. Machinic perception, on the other hand, can be “sightless”, as Paul Virilio writes in the later parts of Vision Machine. Have you considered the ways that machinic perception may eclipse the terrain of sight entirely, to become a POV that is not tied to regimes of sight, but to some other form of sensing and sense-making?


  2. Dear Carleigh, thx for your interesting comment. It made me think about some stuff I wasn’t completely aware of, and it has moved something.

    POV is an acronym in common between POV-matter and POV-apparatus, so it should be a feature consistent with both regimes. In this sense, POV is only about orientation, and not embodiment, because this feature applies to both POV-matter and POV-apparatus. From here, POV can turn into embodiment in the case of organic POV-matter such as human beings and can turn into a techno-aesthetic feature proper of POV-apparatus (or POV technologies of vision) able to produce a feeling of embodiment / disembodiment into viewers / users.

    POV as an embodied way of seeing comes into being with cinema, where POV is a technical format capable of producing for the first time the seamless overlapping between the human and the technological. So cinematic POV is about a certain articulation of the body with technology, an articulation that to me frames 21st-century POV technologies of vision and their aggression on the human body. This aggression is not only visual. As you suggest, machines track a number of different senses and not only through visual means (and maybe not even mainly, or at least not only and forever mainly visual, seeing current technological trends), and attempt to access affects. Affects are what fill the gap between action and reaction “without filling it in or filling it up” (Deleuze), where POV emerges as the interval between an action and a reaction and as a locality where affects can express themselves. The gap machines are attempting to capture is of the same type as the ones appearing during the formation of the first “centers of indetermination” emerging from the pre-biotic soup after the inflation that follows the Big Bang. Back then, POV wasn’t yet visual, and it was simply the emergence of a directionality in an a-centered (pre)vectorial space, as you well said, composed of POV-matter and fields.

    Not sure at the moment, though, at what stage we can start talking of orientation as embodied, or even whether the tendency towards embodiment is better located in the transformation between inorganic and organic – it’s an interesting question I will have to put some time into. In any case, POV becomes as well the way or the form in which this gap – and the related affects – are harnessed. How machinic POV harnesses the interval, or even whether the interval is present in machinic POV vicariously, as the production of a POV-data selfie built by machinizing the affect, the missing half second of perception, is also a good question to ask I think. Finally, isn’t affect itself already the machine in the man – the automatism, acting independently, as a desubjectivating force within the human being itself, in the wording of Brett in his paper referring to Guattari? If machine is automatism, automatism in the human is affect. Hence the interesting phase we’re currently in, with machinic automatism trying to grasp affective automatism. Sorry for the long reply, and looking forward to talking more!

