Twitter Detox

I have been taking some time off recently. I don’t really want to talk about the circumstances surrounding it because that isn’t the point of this post.

Since I have had time to work on slides and projects, I figured I would be super productive. I have been kind of productive, but not as much as I would like.

I have noticed I have no attention span. It’s gotten worse over the last two years or so.

I was on my flight to CocoaConf San Jose and I nearly had a panic attack because I couldn't tweet for the four hours or so I was on the plane. When I go to take a bath, I bring a bunch of books and even video games to consume in the tub, but I wind up spending all of my time on my phone chatting with people on Twitter. I get bored playing video games because they don't move fast enough, and that revelation disturbs me quite a lot. I used to cross stitch, but now doing anything for more than a few minutes at a time is incredibly difficult.

There has been some question about the rise in cases of ADHD. I think that some people, like my teacher Eric Knapp, were born with it. But I also think that we can train our brains to mimic symptoms of it and that the massively connected world we live in has not been good for this.

I will wake up in the middle of the night and not be able to get back to sleep until I check Twitter and my email. This is really unhealthy.

I have no idea when this started getting really bad, but this is the first time I have noticed it.

Twitter has been really good at helping me establish and maintain connections with people I might only meet once at a conference. I know that a lot of people have gotten to know me and my pugs through my tweets.

I am just at a point where I am concerned about my long-term ability to function and be productive and I would like to break this cycle of thinking.

So I am going to delete the Twitter app from my phone. I am going to close the tab I have open for Twitter on my laptop.

I will answer email. There should be enough places online for people to find my email without me having to put it in this post. I will be on Slack channels. I will try to write on my blog more. I am not trying to make myself unreachable; I just need to disconnect from The Matrix for a while and relearn how to pay attention and focus so I can do things I think are important.

At this point I am planning to do this until the beginning of May. I want to give myself a good detox period.

I think that by doing this I will spend less time on my computer when I am not working. I will be more productive when I am on my computer. I think that I waste hours each day just chatting with people on Twitter and if that outlet isn’t there anymore I will be less likely to zone out in front of my computer for hours each night.

I hope that my decision is respected and I plan to write about my observations of the changing state of my focus on this blog. I am interested in seeing how my focus changes over the next month or so.

Thank you.

Caesar Cipher

I have been trying to do more programming exercises recently. One thing that I think is incredibly important when you are an engineer is to have an understanding of the problem you are trying to solve. If you’re not solving a problem, you wind up with Clever Code, and that’s just bad for everyone.

I have a casual interest in cryptography. It’s one of those things that always fascinated me but I never realized I could actually learn and apply to things.

I couldn’t sleep last night and I looked around for a book to read. I found my copy of The Code Book by Simon Singh.

I found a bunch of really interesting cryptographic methods that looked like they would make good code projects.

I am going to go through and do some of these in order of complexity and age. The older the method, the simpler it tends to be, which makes sense because new cryptographic algorithms are generally created in response to existing ones being broken.

I will be adding all of my projects to a GitHub repository.

I’ll probably hit a wall at some point where I can’t continue to do this, but for now I have a few that I can do and I am looking forward to figuring out how to implement these.

Working Caesar Cipher

What is a Caesar Cipher?

A Caesar Cipher is an encryption technique where every letter of the text is substituted with the letter a fixed offset further along the alphabet.

For example, if your offset is 3, then every time you have an “a” in your text that you are encrypting, it would be replaced by a “d.” “b” would be replaced by “e” and so on. Once you get to the end of the alphabet, you start at the beginning. “z” would be replaced by “c”, “y” would be replaced by “b”, and so forth.

This cipher was used for several hundred years, but it is easily broken. In fact, there is a daily cryptogram puzzle, which no one ever looks at, published in the newspaper next to the comics.

Even though this specific cipher is fairly easy to break, it provides the basis for more complex ciphers, which is why I am starting with this one. I can build off of it to work on more complicated problems.

What are our Requirements?

In order to implement a solution for this problem, it’s helpful to think through what this application has to do.

  1. Take in a string
  2. Choose an offset
  3. Check each letter in the string
  4. Shift the letter by the offset
  5. Return an encrypted string

The first part is easy. We simply use the built-in UI elements in iOS to create text fields that contain the input text and the encrypted string.

Since we’re dealing with an offset where one letter will be substituted for another, it made sense to use arrays. There should be a constant array that simply contains the entire alphabet, with each letter constituting an element.

The second array is a little more tricky. We need to shift the elements in the array by a variable offset. You use the offset to figure out what the first element in the new array is.

After you get both arrays, you find the index of each letter in the first array and replace it with the correlating letter in the second array.

Implementation

I like to keep as much code out of the view controller as possible, so it only has one line of code:

@IBAction func encryptText(sender: AnyObject) {
    encryptedText.text = encryptedString(inputText.text)
}

I have a helper function that takes the text input as a parameter and outputs the encrypted string, along with some other details, to the text output. Everything else is in a separate file of standalone functions.

The first thing I wanted to create was my immutable alphabet array:

let alphabet:[Character] = ["a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m", "n", "o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z"]

Next up, I need my offset. I could make that something the user inputs, but it was easier to just have it be a randomly generated number:

func randomNumber() -> Int {
    return Int(arc4random_uniform(25) + 1)
}

I want this to return a number between one and twenty-five. Since the array starts at zero and a zero offset means no encryption, it makes sense to have the lowest number be a one. Since the alphabet is always going to be 26 characters long, having a hard-coded value of twenty-five makes sense in this scenario.

I would like to unit test this method, but I don’t know how to do that. Since the output is random, I don’t really know how to generate a useful unit test. At this point I have to just have faith that this piece of code won’t return a number that is outside of the index of my array.
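
One approach I have seen for testing code like this (a sketch assuming XCTest and a hypothetical test class, not something I have in the project yet) is to assert on the range of the output rather than a specific value, since the range is the only promise the function makes:

import XCTest

class CipherTests: XCTestCase {
    func testRandomNumberStaysInRange() {
        // randomNumber() only promises a value from 1 through 25, so run it
        // many times and assert on the range instead of a specific value.
        for _ in 0..<10000 {
            let result = randomNumber()
            XCTAssertTrue(result >= 1 && result <= 25, "Offset \(result) falls outside 1...25")
        }
    }
}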

I don’t like leaving things to chance, so in my code to create my encryption array I deal with this possibility to avoid making my app crash:

func encryptArray(var offset:Int) -> [Character] {
    
    if offset < 1 {
        offset = 1
    }
    
    if offset > 25 {
        offset = 25
    }
    
    var array = alphabet
    let slice = array[0..<offset]
    array.removeRange(0..<offset)
    array.appendContentsOf(slice)
    
    return array
    
}

The last few lines of code copy the range at the beginning of the array, hack it off of the front, then paste it onto the back.

I feel like there should be a more efficient way of doing this, but it works for now, so huzzah.
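
For what it's worth, one alternative I may try later (just a sketch, using the same alphabet constant and the same clamping) is to build the rotated array with modular index arithmetic instead of slicing and appending:

func encryptArrayWithModulo(offset: Int) -> [Character] {
    // Clamp the offset the same way, then wrap each index around the alphabet.
    let clamped = max(1, min(offset, 25))
    var rotated = [Character]()
    for index in 0..<alphabet.count {
        rotated.append(alphabet[(index + clamped) % alphabet.count])
    }
    return rotated
}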

One of the things I really wanted to do was figure out a way to use map(). So far I have not found a good reason to use it. I have seen clever code from people that is longer than mine but more functional. I really wanted to find a situation where mapping a function made sense. Generating an array of characters that are offset by a certain fixed amount from another array of characters fits this to a tee.

Before I can use map(), I need to set up the function that is going to be mapped to the array:

func encryptCharacter(input:Character, encryptedArray:[Character]) -> Character {
    
    if alphabet.contains(input) {
        let index = alphabet.indexOf(input)
        return encryptedArray[index!]
    } else {
        return input
    }
}

One case I had to handle was if the character was not a letter. We’re dealing with spaces, commas, periods, etc… I do a check to make sure that the alphabet array contains the character being passed in. If it doesn’t, then it is a non-letter character and we return it unchanged.
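
As a quick sanity check of both paths, with a hypothetical offset of 3:

let testArray = encryptArray(3)
encryptCharacter("a", encryptedArray: testArray)  // returns "d"
encryptCharacter(" ", encryptedArray: testArray)  // not a letter, returned unchanged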

Now that we have our processing function, and all of our other functions, for that matter, it’s time to pull it all together:

func encryptedString(stringToEncrypt:String) -> String {
    let lowercaseStringToEncrypt = stringToEncrypt.lowercaseString
    let offset = randomNumber()
    let encryptionArray = encryptArray(offset)
    let array = [Character](lowercaseStringToEncrypt.characters)
    
    let encryptedArray = array.map({encryptCharacter($0, encryptedArray: encryptionArray)})
    
    let encryptedString = String(encryptedArray)
    
    return "(encryptedString)
 The offset was (offset).
 (alphabet[0]) = (encryptionArray[0])"
}

This isn't working quite right...

The first time I ran this project, I realized that my analysis is case sensitive. I had letters that would fall through and be returned directly because the algorithm did not catch that they were letters. I resolved this by forcing all of the characters to be lowercase so that they would be processed. I could have made another array of uppercase letters and added more complexity to the encryption process, but I didn’t feel like it. This was a simple fix.

Capital letters don't get no respect.

I am creating variables to hold the results of all of my helper functions. Where necessary, I am passing the results of some of these functions into the functions that follow. We have a nice line of parts that all connect to one another in only one way.

In order to pass each character in, I need to create an array of characters from the string input in the view controller. There is a nice handy function that makes this process somewhat easy.

I was able to effectively use my nice map() function to process each character in the array of characters.

After going through and encrypting each character in the array, we can bring it back together again in a String.

Lastly, I wanted to include some information about the encrypted string beyond just the string. I wanted to include the offset and show what the first element in the encrypted array was.

Conclusion

I am not certain how to write unit tests for two of my standalone functions. They have randomly generated outputs. I think there must be a way to prove these work, even if it isn’t with unit tests. If anyone has suggestions about how to proceed with this, I would appreciate it.

Overall, this was a nice little project to work on before heading off to RWDevCon. I have a larger independent project I would like to work on that I didn’t want to start when I would be gone for a week.

I find encoding and data processing to be fascinating. I would like to work on more stuff like this, even if it is just for my blog. I am hoping to have a more complicated encryption project on here soon.

How to Add A New Shader to GPUImage

Last year I wrote this article about image processing shaders.

I explain how several shaders in GPUImage work and what the points of a vertex shader and a fragment shader are. But since that article came out, I have realized there is a lot more to the subject than I was able to cover there.

In spite of the fact that I worked for Brad Larson for a year, I never wrote a shader for GPUImage. When I started trying to figure it out, it became incredibly intimidating. There was a lot of stuff going on in the framework that I was less than familiar with. I also knew there was a lot of stuff he and I spoke about that was not really made available to the general public in the form of documentation.

So, in this blog post, I would like to extend that post somewhat by talking specifically about GPUImage rather than shaders in the abstract. I want to talk about the internal workings of GPUImage to the point that you understand what components you need to connect your shader to the rest of the framework. I want to talk about some general conventions used by Brad. Lastly, I would like to talk about how to construct complex shaders by combining several more primitive shaders.

I am sending this blog post to Brad for tech reviewing to ensure that everything in this post is factually correct and to ensure that I am not propagating any false information. If there is something I am missing here, I would appreciate having someone reach out to me to ask about it so I can include it here.

Fork GPUImage

If you want to contribute to GPUImage, you must create a fork of the repository. I am actually using this post to recreate all of my work because I didn’t fork the repo, tried to create a branch, and it was turtles all the way down.

I had a fork that was two years old and 99 commits behind the main branch. I am only using my forks to create projects to contribute to the framework, so I destroyed my old fork and created a new one. If you’re better at Git than I am and can keep pulling in the changes, then awesome. My Git-fu is weak so I did this, which is probably making a bunch of people sob and scream “WHY?!?”

Adding Your New Filter to the Framework

There are two different versions of the GPUImage framework: Mac and iOS. They share a lot of the same code files, but you need to add the new filter to both manually.

My Solarize filter is one that modifies the color, so I dragged and dropped my filter into the group of color-modifying shaders. The source files go into the Source folder in the Framework folder in the directory you clone from GitHub.

Where the shader files go

While you are looking at your .h file, you need to make sure that it is set to Public rather than Project. If you don’t do that, it won’t be visible for the next step in this process.

This is the wrong privacy setting for the header file.

This is the correct privacy setting

Next, you need to go to the Build Phases of the GPUImageFramework target of your project. You need to add the .h file to the Headers in the framework and the .m file to the Compile Sources.

I had some trouble doing this. When I would click on the “+” button at the bottom of the filter list, I would only be able to find the opposite file. So for example, when I tried to add my header to the Headers list, I would only be able to find the solarization implementation file. The way I got around this was to drag the file from the list and drop it into the build phases target.

Drag the file from the left to the correct build phase on the right.

Lastly, you need to add your filter to the GPUImage.h header file so that the framework has access to your shader. (By the way, wasn’t C/Objective-C a pain??)

The GPUImage.h is different for the Mac and the iOS project, so if you want your filter to be available on both platforms, you need to remember to add it to both GPUImage.h header files.

Adding Your New Filter to the Filter Showcase

First off, I want to warn you to only use the Swift version of the Filter Showcase. I had a lot of stuff that I was adding in here to try and get the Objective-C version working; that was the main reason I split this into two blog posts. It is a pain in the ass, and it was made far simpler in the Swift version of the Filter Showcase.

When you open the Filter Showcase, you will be asked if you want to update to the recommended project settings. Don’t do this!

The only place you need to make a change is in the file FilterOperations.swift. This file is shared between both the iOS and the Mac Filter Showcase apps, so you only have to change this once. Huzzah!

You need to add your new filter to the filterOperations array. There are a few things you need to set in your initialization:

FilterOperation (
        listName:"Solarize",
        titleName:"Solarize",
        sliderConfiguration:.Enabled(minimumValue:0.0, maximumValue:1.0, initialValue:0.5),
        sliderUpdateCallback: {(filter, sliderValue) in
            filter.threshold = sliderValue
        },
        filterOperationType:.SingleInput
    ),

I looked at the parameters used in the GPUImageLuminanceThresholdFilter initialization because my shader was primarily based on this code. As such, your filters might have different parameters than mine did. Look around at other instances to get an idea of how your filter should be initialized.

When you build and run the Filter Showcase, you might encounter an issue where the project builds but doesn’t run. You might see this pop-up:

Update the what now??

If this happens, don’t update anything. Xcode is trying to run the wrong scheme. Look up at the top of Xcode near the “Play” button and check the scheme. It should say Filter Showcase. If it says GPUImage instead, then you need to change the scheme and it should work okay.

Wrong schemes are full of fail.

GPUImage Style Guide

I am going to go through my Solarize filter to make note of some style things that you should keep in mind when you are writing your own filters to try and keep things consistent.

The entire implementation file for the shader is represented in this section, but I am showing a chunk at a time and explaining each part.

#import "GPUImageSolarizeFilter.h"

As with any C-family program, you need to import the header for your shader in the implementation file.

#if TARGET_IPHONE_SIMULATOR || TARGET_OS_IPHONE
NSString *const kGPUImageSolarizeFragmentShaderString = SHADER_STRING
(
 varying highp vec2 textureCoordinate;
 
 uniform sampler2D inputImageTexture;
 uniform highp float threshold;
 
 const highp vec3 W = vec3(0.2125, 0.7154, 0.0721);
 
 void main()
 {
     highp vec4 textureColor = texture2D(inputImageTexture, textureCoordinate);
     highp float luminance = dot(textureColor.rgb, W);
     highp float thresholdResult = step(luminance, threshold);
     highp vec3 finalColor = abs(thresholdResult - textureColor.rgb);
     
     gl_FragColor = vec4(finalColor, textureColor.w);
 }
);

Since GPUImage is cross-platform, you need to check whether your shader is running on an iOS device or not. Since iOS uses OpenGL ES and not plain OpenGL, there are slight differences that you need to take into consideration.

Notice how a bunch of the variables are described as highp. In OpenGL ES, where you have more limited processing power, you can optimize your code by lowering the precision wherever you don’t need much of it. You do not do this in the Mac version of the shader:

#else
NSString *const kGPUImageSolarizeFragmentShaderString = SHADER_STRING
(
 varying vec2 textureCoordinate;
 
 uniform sampler2D inputImageTexture;
 uniform float threshold;
 
 const vec3 W = vec3(0.2125, 0.7154, 0.0721);
 
 void main()
 {
     vec4 textureColor = texture2D(inputImageTexture, textureCoordinate);
     float luminance = dot(textureColor.rgb, W);
     float thresholdResult = step(luminance, threshold);
     vec3 finalColor = abs(thresholdResult - textureColor.rgb);

     gl_FragColor = vec4(vec3(finalColor), textureColor.w);
 }
);
#endif

One last thing I want to point out is the use of NSString *const kGPUImageSolarizeFragmentShaderString = SHADER_STRING. It is a convention to take the name of your shader, add a “k” at the beginning (“k” for “constant”, get it?), and follow it with “FragmentShaderString.” Use this convention when you write your own shaders.

In each shader that you create for GPUImage, you need to remember to include the textureCoordinate and the inputImageTexture. These two variables receive the exact pixel you are going to process, so make sure you have them in each of your shaders.

Now we move on to the implementation:

@implementation GPUImageSolarizeFilter;

@synthesize threshold = _threshold;

I have a public-facing variable, threshold, that I need access to in order to implement the shader, so it needs to be synthesized. Again, this is a throwback from C/Objective-C that you might not be familiar with if you just started with Swift.

#pragma mark -
#pragma mark Initialization

- (id)init;
{
    if (!(self = [super initWithFragmentShaderFromString:kGPUImageSolarizeFragmentShaderString]))
    {
        return nil;
    }
    
    thresholdUniform = [filterProgram uniformIndex:@"threshold"];
    self.threshold = 0.5;
    
    return self;
}

All filters will look similar to this. They all have an initializer that attempts to initialize the fragment shader from the string you created back at the beginning of the if/else statements.

If you have a public facing variable like threshold is here, you need to set that up before you finish your initialization.

#pragma mark -
#pragma mark Accessors

- (void)setThreshold:(CGFloat)newValue;
{
    _threshold = newValue;
    
    [self setFloat:_threshold forUniform:thresholdUniform program:filterProgram];
}


@end

Lastly, if you have a public facing variable, like we do with the threshold, you need an accessor for it.

Copy/Paste Coding

I normally really do not condone “Copy/Paste” coding, but I did want to mention that there is a lot of repetitive stuff in all of the shader programs. Only about five lines of code changed between the Luminance Threshold filter and my Solarize filter.

Generally speaking, copying and pasting code is bad, but looking through a piece of code and figuring out what everything does while you are learning, so that you can do it on your own in the future, isn’t always the worst thing in the world. I could not have written my shader without an understanding of how the two contributing filters worked.

So, use these as learning materials and there’s nothing wrong with looking at how they’re put together until you feel comfortable with the process on your own.

Wrapping Up

I know that trying to start something unfamiliar can be really intimidating, especially if there are a bunch of esoteric steps that you’ve never done before and you don’t want to have to ask about them because you feel stupid.

I went back and forth with Brad a dozen times while writing this blog post and none of my questions were about the actual shader code. It was all about how to add this to the framework, why my shader wasn’t showing up in the header, etc…

This stuff is not intuitive if you haven’t done it before. This might be familiar to older Mac programmers who had to do this stuff all the time, but if you primarily work with Swift where everything is just there and you don’t have to add things to build phases, then this can be extraordinarily frustrating.

I hope that this is helpful to those who have wanted to contribute to GPUImage but got frustrated by the hoops that were necessary to jump through to get something working. I hope that this means that people have a resource besides Brad to get answers to common questions that aren’t really available on Stack Overflow.

How to Write a Custom Shader Using GPUImage

One of my goals over the last year or so was to do more coding and to specifically do more GLSL programming. You learn best by actually working on projects. I know that, but I have had some trouble making time to do things on my own.

I want to go over my thought process in figuring out a filter I wanted to write, show how I was able to utilize the resources available to me, and hopefully give you an idea of how you can do the same.

You can clone GPUImage here.

Solarize Image Filter

One thing I had trouble with in GPUImage is the fact that it is too comprehensive. I couldn’t think of an image filter that wasn’t in the framework already.

So I started thinking about Photoshop. I remembered there was a goofy filter in Photoshop called Solarize.

Since I knew that Brad was more concerned with things like edge detection and machine vision, I figured that it would not have occurred to him to include a purely artistic filter like that. Sure enough, there was no solarization filter. Jackpot!

After figuring out something that wasn’t there, the next question was how to create it. I initially was going to use this Photoshop tutorial as a jumping-off point, but I wondered if there was a better way. I Googled “Solarize Effect Algorithm” and found this computer science class web page that gave an implementation-agnostic description of how the effect works.

What is Solarization?

Solarization is an effect from analog photography. One important component of a photograph is its exposure time. Photographers used to purposely overexpose their photos to generate this effect. When a negative or a print is overexposed, parts of the image will invert their color. Black will become white, green will become red.

There is a threshold, and any part of the image that passes that threshold receives the effect.

One of the things Brad told me about GPUImage was that many of the filters in GPUImage are composed of many smaller filters. It’s like building blocks. You have a base number of simple effects. These effects can be combined together to generate more and more complex effects.

Looking at the algorithm for solarization, I noticed it requires two things:

  • An adaptive threshold to determine what parts of the image receive the effect
  • An inversion effect on the pixels

I opened up GPUImage to see if there were already filters that do those things. Both of these functions already exist in GPUImage.

Now I needed to figure out how they work so that I could combine them into one complex filter.

GPUImageColorInvertFilter

Since the color invert filter is the simpler of the two filters, I will be looking at this one first.

Since this is a straightforward filter that does one thing without any variables, there are no publicly facing properties in its header file.

Here is the code for the fragment shader that we see in the implementation file:

NSString *const kGPUImageInvertFragmentShaderString = SHADER_STRING
(
 varying highp vec2 textureCoordinate;
 
 uniform sampler2D inputImageTexture;
 
 void main()
 {
    lowp vec4 textureColor = texture2D(inputImageTexture, textureCoordinate);
    
    gl_FragColor = vec4((1.0 - textureColor.rgb), textureColor.w);
 }
);

These two lines will exist in every fragment shader in GPUImage:

varying highp vec2 textureCoordinate;
uniform sampler2D inputImageTexture;

These lines bring in the image that is going to be filtered and the specific pixel that the fragment shader is going to work on.

Let’s look at the rest of this:

void main()
{
   lowp vec4 textureColor = texture2D(inputImageTexture, textureCoordinate);
   gl_FragColor = vec4((1.0 - textureColor.rgb), textureColor.w);
}

The first line simply obtains the RGBA values at the specific pixel that you are currently processing. We need that for the second line of code.

The gl_FragColor is the “return” statement for the fragment shader. We are returning what color the pixel will be. Texture colors are part of a normalized coordinate system where everything exists on a continuum between 0.0 and 1.0. In order to invert the color, we are subtracting the current color from 1.0. If your red value is 0.8, the inverted value would be (1.0 - 0.8), which equals 0.2.

This is a simple and straightforward shader. All of the shader-specific logic is in the gl_FragColor line, where we are doing the work to invert the colors.

Now let’s move on to the more complicated shader.

GPUImageLuminanceThresholdFilter

The Luminance Threshold Filter is interesting. It checks the luminance of each pixel, and if it is above a certain threshold, the pixel will be white. If it’s below that threshold, the pixel will be black. How do we do this? Let’s find out.

The Luminance Threshold Filter, unlike the color inversion filter, has public facing properties. This means that the header file has some actual code in it that we need to be aware of. Since this one is interactive and depends upon input from the user, we need a way for the shader to interface with the rest of the code:

@interface GPUImageLuminanceThresholdFilter : GPUImageFilter
{
    GLint thresholdUniform;
}

/** Anything above this luminance will be white, and anything below black. Ranges from 0.0 to 1.0, with 0.5 as the default
 */
@property(readwrite, nonatomic) CGFloat threshold; 

@end

Our threshold property will receive input from a slider in the UI that will then set a property we will need to calculate our shader.

NSString *const kGPUImageLuminanceThresholdFragmentShaderString = SHADER_STRING
( 
 varying highp vec2 textureCoordinate;
 
 uniform sampler2D inputImageTexture;
 uniform highp float threshold;
 
 const highp vec3 W = vec3(0.2125, 0.7154, 0.0721);

 void main()
 {
     highp vec4 textureColor = texture2D(inputImageTexture, textureCoordinate);
     highp float luminance = dot(textureColor.rgb, W);
     highp float thresholdResult = step(threshold, luminance);
     
     gl_FragColor = vec4(vec3(thresholdResult), textureColor.w);
 }
);

This has a few more components than the color inversion code. Let’s take a look at the new code.

uniform highp float threshold;
const highp vec3 W = vec3(0.2125, 0.7154, 0.0721);

The first variable is the value we are receiving from the user for our threshold. We want this to be very accurate because we’re trying to determine whether a pixel should receive an effect or not.

The second line has what looks like a set of “magic numbers.” These numbers are a formula for determining luminance. There are explanations for it here and in the iPhone 3D Programming book by Philip Rideout. We will use this constant to compute the luminance of the current fragment in the next bit of code.

void main()
{
    highp vec4 textureColor = texture2D(inputImageTexture, textureCoordinate);
    highp float luminance = dot(textureColor.rgb, W);
    highp float thresholdResult = step(threshold, luminance);
     
    gl_FragColor = vec4(vec3(thresholdResult), textureColor.w);
}

The first line does the same thing as it did in the color inversion filter, so we can safely move on to the next line.

For the luminance variable, we’re encountering our first function that doesn’t exist in C: dot(). dot() takes each red, green, and blue property in our current pixel and multiplies it by the corresponding property in our “magic” formula. These are then added together to generate a float.
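
Written out by hand, the dot product here is just a weighted sum of the three color channels. Here is the same math in plain Swift, with a made-up mid-gray pixel:

// dot(textureColor.rgb, W) expands to 0.2125 * r + 0.7154 * g + 0.0721 * b
let (r, g, b) = (0.5, 0.5, 0.5)
let luminance = 0.2125 * r + 0.7154 * g + 0.0721 * b  // 0.5 for mid-gray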

I have tried hard to find a good explanation of why the dot product exists and what it does. This is the closest thing I can find. One of my goals with this series of blog posts is to dig into things like the dot product, where I can explain what something does but not yet fully why it is used or what functionality it fulfills. For now, hopefully this is enough.

Next up, we have the threshold result. This is where we are doing our only conditional logic in the shader. If you recall, this shader determines whether each pixel should be white or black. That determination is being made here.

The step() function compares two floats. If the second number is greater than or equal to the first, the result is 1.0, meaning that particular pixel is bright enough to pass the threshold requirements. If the second number is smaller than the first, the pixel is too dim to pass the threshold and the result is 0.0.
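
If it helps, my mental model of GLSL's step() boils down to something like this in Swift (my own sketch, not GPUImage code):

func step(edge: Double, _ x: Double) -> Double {
    // 0.0 when x is below the edge, 1.0 otherwise.
    return x < edge ? 0.0 : 1.0
}

step(0.5, 0.7)  // 1.0: bright enough to pass the threshold
step(0.5, 0.3)  // 0.0: too dim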

Finally, we map this result to our “return” statement, the gl_FragColor. If the pixel passed the threshold, then our RGB values are (1.0, 1.0, 1.0), or white. If it was too dim, then the RGB values are (0.0, 0.0, 0.0), or black.

GPUImageSolarizeFilter

According to the algorithm that describes the solarization process, you use a luminance threshold to determine if a pixel receives an effect or not. Instead of making the pixel either white or black, we want to know if the pixel should be left alone or if its color should be inverted.

The only line of code that the color inversion filter has that the threshold doesn’t have is a different gl_FragColor:

// Color inversion gl_FragColor
gl_FragColor = vec4((1.0 - textureColor.rgb), textureColor.w);

// Luminance Threshold gl_FragColor
gl_FragColor = vec4(vec3(thresholdResult), textureColor.w);

I am embarrassed to say how long it took me to figure out how to combine these two filters. I thought about this for a long time. I had to think through all of the logic of how the threshold filter works.

The threshold filter either colors a pixel black or white. That is determined in the thresholdResult variable. This means that we still need the result in order to figure out if a pixel receives an effect or not, but how do we modify it?

Look at this part of the gl_FragColor for the color inversion:

(1.0 - textureColor.rgb)

Where else do we see 1.0 being used in the threshold shader? It’s the result of a successful step() function. If it fails, you wind up with 0.0. We need to change the gl_FragColor from the color inversion filter to have the option to return a normal color or an inverted color. If we subtract 0.0 from the texture color, then nothing changes.

This is the final code I came up with to implement my Solarize Shader:

NSString *const kGPUImageSolarizeFragmentShaderString = SHADER_STRING
(
 varying highp vec2 textureCoordinate;
 
 uniform sampler2D inputImageTexture;
 uniform highp float threshold;
 
 const highp vec3 W = vec3(0.2125, 0.7154, 0.0721);
 
 void main()
 {
     highp vec4 textureColor = texture2D(inputImageTexture, textureCoordinate);
     highp float luminance = dot(textureColor.rgb, W);
     highp float thresholdResult = step(luminance, threshold);
     highp vec3 finalColor = abs(thresholdResult - textureColor.rgb);
     
     gl_FragColor = vec4(finalColor, textureColor.w);
 }
);

This is primarily composed of the same code to create the luminance threshold shader, but instead of mapping each pixel to be either black or white, I am using that result to check to see if I am inverting my colors. If the colors need to be inverted, then the thresholdResult is 1.0 and our formula moves forward as usual. If thresholdResult is 0.0, then our texture color remains the same, except now it is negative, which is why I wrapped it in an abs() function.
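
Plugging hypothetical numbers into abs(thresholdResult - textureColor.rgb) for a single red value of 0.8 shows both branches:

abs(1.0 - 0.8)  // 0.2: thresholdResult was 1.0, so the channel is inverted
abs(0.0 - 0.8)  // 0.8: thresholdResult was 0.0, so the channel is unchanged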

Completed solarize shader. Will try to get a better picture later.

Takeaways

One of the big things I keep harping on in this blog over the last year or so is that to be a good engineer, you must have an understanding of what you’re trying to accomplish.

Breaking down a shader that you want into an algorithm helps you to tease out what pieces of functionality you need to get the result you want. In a lot of cases, the smaller pieces of functionality are already out there in the world somewhere and you can reuse them to build more complex things.

You also need to be able to read through those shaders to figure out how they work so you can know where the pressure points are to change the code. Since shaders are so small, you can usually take one line at a time and break down what it does. If there is a function being used that you’re unfamiliar with, Google it. You don’t always need to understand everything about why something is being used in order to implement it, as with the dot() function. I had a good enough grasp of what it did to understand why it was needed in my shader, which was all I really needed.

This stuff can be intimidating, which is why it’s important to spend some time figuring out why something works and, more importantly, figuring out what YOU want to do with it.

I will be following this post up tomorrow with instructions for how to add a shader to the GPUImage framework, including some other parts of the shader code that I did not go over here.

Exercises for Programmers: Exercise 4 - Mad Libs

The fourth exercise in Exercises for Programmers is to create a Mad Lib.

Mad Libs are a game you play when you’re a kid and you’re trying to learn different parts of speech. You name random nouns and verbs to fill in the blanks in a story you can’t see until you are finished.

This project continues with the theme of making the exercises more and more complex as you go along.

This project requires you to ask for four different user inputs and to create a story around them. If any of those inputs is missing, you want to alert the user rather than output a story with half the words missing.

As usual, I have uploaded this project to GitHub.

What this looks like when successful

Data Structures

Since I want to unit test my application, I want to move as much of the programming logic outside of the view controller as possible. I need to grab the four user input responses and I want to pass those into another function.

I thought about making this into an array of strings, but I liked the idea of being able to have keys and values for the user input. I want to be able to tell the user which field they forgot to fill and that is a lot easier when you bundle the name of the field along with the input instead of just counting on things being in the right order. Having an array of random words with no indication about what they represent seemed like a recipe for disaster.

In the @IBAction for the Enter button, I created a dictionary of all of the user input values:

let userInput:[String:String] = [
    "noun":nounTextField.text!,
    "verb":verbTextField.text!,
    "adjective":adjectiveTextField.text!,
    "adverb":adverbTextField.text!
]

Now that I had a data structure and I knew what types were going into it, I could start working on the output logic.

Set up

To Map(), or not to Map()

One reason I wanted these values to exist in a data structure was because I wanted to use Map().

This seemed like a good time to use Map(). I had to iterate through a data structure, check to see if the input was empty, and, if it was, do something about it.

Taking this approach really tripped me up and cost me a lot of time.

When I worked with Brad at SonoPlot, he used protocols and flatmaps to iterate through a collection of data we were writing to and getting from the NSUserDefaults. I thought I would set up a function that returns an optional string that would check each string to see if there was anything in it. If there wasn’t, I could return a nil and filter them out.

func isStringEmpty(input:String) -> String? {
    if input.characters.count != 0 {
        return input
    } else {
        return nil
    }
}

I started to realize this wasn’t really helpful to me. Here is why:

Let’s say my user forgets to specify a verb. I can flatmap my dictionary using this function that will give me an array with three values. I won’t know which value is missing. I can’t tell the user which value they forgot to enter.
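
Here is a quick sketch of the problem, with hypothetical input:

let input = ["noun": "pug", "verb": "", "adjective": "fuzzy", "adverb": "gently"]
let nonEmpty = input.values.flatMap { isStringEmpty($0) }
// nonEmpty holds three strings, but nothing tells me "verb" was the empty one.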

I spent a bunch of time trying to figure out how to map my isStringEmpty() function to my dictionary. In order to try and figure out how to optimize this, I wrote it in the most inefficient manner I possibly could so I could see what code I was repeating:

func output(input:[String:String]) -> String {
    
    var outputString = String()
    
    let noun = isStringEmpty(input["noun"]!)
    let verb = isStringEmpty(input["verb"]!)
    let adjective = isStringEmpty(input["adjective"]!)
    let adverb = isStringEmpty(input["adverb"]!)
    
    if noun == nil {
        outputString += "Please enter a noun! \n"
    }
    
    if verb == nil {
        outputString += "Please enter a verb! \n"
    }
    
    if adjective == nil {
        outputString += "Please enter an adjective! \n"
    }
    
    if adverb == nil {
        outputString += "Please enter an adverb! \n"
    }
    
    if noun != nil &&
       verb != nil &&
       adjective != nil &&
       adverb != nil
    {
        outputString = "Do you \(verb!) your \(adjective!) \(noun!) \(adverb!)?"
    }
    }
    
    return outputString
}

Clearly I am repeating my calls to isStringEmpty() for every input in my UI. This is really inefficient.

I am also constantly checking if something is nil. Since the variable I am checking has the same name as the type of word that I need to tell the user to enter, this could probably be more generic.

I optimized this from that monstrosity to the following:

func output(input:[String:String]) -> String {
    var outputString = String()
    
    for (key, value) in input {
        let checkValue = isStringEmpty(value)
        
        if checkValue == nil {
            outputString += "Please enter a \(key)! \n"
        }
        
    }
    
    if outputString.characters.count == 0 {
        let noun = input["noun"]!
        let verb = input["verb"]!
        let adjective = input["adjective"]!
        let adverb = input["adverb"]!
        
        outputString = "Do you \(verb) your \(adjective) \(noun) \(adverb)?"
    }
    
    return outputString
}

I tried for a bit to use a map() function rather than the for-in loop, but it just felt easier to do it this way. I would still like to get more comfortable using map(), but I also don’t want to create an “If You Give a Mouse a Cookie” problem where I am adding a bunch of garbage to my code to try and force a square peg into a round hole.

If anyone can give me a good explanation of how I could have used map() more efficiently than for-in, I would greatly appreciate it. I have seen it used in other people’s code, but I can’t yet see how to implement it myself because I have not tried to solve a problem with it.
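
For what it's worth, here is one shape a map()-style version could take, using flatMap over the dictionary and joinWithSeparator (a hedged sketch with a made-up function name; I have not convinced myself it is actually clearer than the for-in loop):

func missingFieldMessages(input: [String: String]) -> String {
    // Produce a message for each empty value, drop nil for filled ones, then join.
    return input.flatMap { (key, value) -> String? in
        return value.characters.count == 0 ? "Please enter a \(key)! \n" : nil
    }.joinWithSeparator("")
}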

Lastly, I realized that my efforts to cram flatmap() into the code were causing me to use the isStringEmpty() function even though there really was no point in doing so.

I condensed down all of my programming logic into one function:

func output(input:[String:String]) -> String {
    var outputString = String()
    
    for (key, value) in input {
        
        if value.characters.count == 0 {
            outputString += "Please enter a \(key)! \n"
        }
        
    }
    
    if outputString.characters.count == 0 {
        let noun = input["noun"]!
        let verb = input["verb"]!
        let adjective = input["adjective"]!
        let adverb = input["adverb"]!
        
        outputString = "Do you \(verb) your \(adjective) \(noun) \(adverb)?"
    }
    
    return outputString
}
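
A quick check of the finished function with hypothetical inputs (the order of any error messages depends on dictionary iteration order, which is not guaranteed):

output(["noun": "", "verb": "pet", "adjective": "fuzzy", "adverb": "gently"])
// "Please enter a noun! \n"
output(["noun": "pug", "verb": "pet", "adjective": "fuzzy", "adverb": "gently"])
// "Do you pet your fuzzy pug gently?"
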
Completely empty response

Limit to User Input Optimization

One thing that bothered me about this project was the fact that I couldn’t procedurally deal with every instance of pulling out the nouns, verbs, etc…

No matter what I do I have to spend four lines of code pulling out the user input and then another four lines doing something with them.

Those lines of code bother me because I don’t like hard-coding things. I was happy I could use the for-in loop because there could be forty elements or two, and either way it would be flexible. Having things hard-coded isn’t flexible.

However, I do not think there is a way around this in this project. I believe that any time you are doing anything that touches the user interface, there is a limit to how far you can optimize it.

When I was going to write the unit tests, I had the urge to see what would happen if I sent a value that wasn’t noun or verb or if one of those was missing. I know that in this project that is impossible, but it still really bothers me because of how inefficient it feels. It had code smell. I think there is nothing I can do about it, but that won’t keep me from driving myself crazy trying to think of a way around it.

Partially empty response

Takeaways

Initially, I thought this project would have a lot more programming logic than any other project before this, but it actually had the least amount.

I left my less efficient code in the project to show how much I was able to refactor the project.

I think a major takeaway that I want to bring up is the idea of rewriting your code a lot.

One issue I have had working with other people on programming projects is that sometimes, as I did here, it’s easier for me to write some really crappy, inefficient code that just works so that I can get a better handle on how to make it better. When I am working on something and someone randomly demands to see what I have so far, it is incredibly upsetting to have to show them that terrible code. There are probably ways around this, but it can be super demoralizing to show the early part of my process to someone when it is something I am still actively working on and trying to make better.

If you are a decent programmer, you will rewrite your code a few times. You will get a better idea about what problems you are trying to solve and how your current approach isn’t what you thought it would be. My final code is about a third of what it was initially because the way I thought I needed to do this was different than the way I actually needed to do it. I could have kept adding more and more junk to the code to try and force it to work, or I could change my perspective on how to approach the problem.

I firmly believe in writing bad code to get the garbage out of your head. Not every line of code needs to be a line from Shakespeare. Writing a lot of bad code helps you figure out how to write better code more quickly.