What is a tf Session?; A connection to the C++ backend that performs the computation.

Why might using NumPy have high overhead?; The cost of switching back to Python after every operation (e.g. after a matrix multiplication done outside Python). This is especially bad when running computations on GPUs or in a distributed setting, where transferring data is expensive. TensorFlow instead defines an entire graph that runs outside Python: the Python code builds the graph and specifies which parts of it should be run.
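A minimal sketch of this build-then-run pattern, assuming the TF 1.x graph/session API used throughout these notes (the constants are made up for illustration):

```python
import tensorflow as tf

# Build the graph in Python: no computation happens yet.
a = tf.constant([[1.0, 2.0]])
b = tf.constant([[3.0], [4.0]])
c = tf.matmul(a, b)  # still just a node in the graph

# Run the requested part of the graph on the C++ backend.
with tf.Session() as sess:
    print(sess.run(c))  # [[11.]]
```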
What does the first dimension of x = tf.placeholder(tf.float32, shape=[None, 784]) correspond to?; The batch size.

What does it mean when a dimension of shape is None?; It can be of any size.
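For example (a sketch in the TF 1.x API; the zero arrays are arbitrary stand-ins for real batches):

```python
import numpy as np
import tensorflow as tf

x = tf.placeholder(tf.float32, shape=[None, 784])
row_sums = tf.reduce_sum(x, axis=1)

with tf.Session() as sess:
    # Because the first dimension is None, we can feed batches of any size.
    print(sess.run(row_sums, feed_dict={x: np.zeros((5, 784))}).shape)    # (5,)
    print(sess.run(row_sums, feed_dict={x: np.zeros((128, 784))}).shape)  # (128,)
```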
Is the shape argument to placeholder compulsory?; No, it's optional, but providing it helps with debugging (shape mismatches can be caught automatically).

What is the difference between a tf.placeholder and a tf.Variable?; A placeholder is an input whose value is supplied (via feed_dict) when we ask tf to run computations; it is not modified by the computation. A Variable holds state in the graph that the computation can modify, e.g. model parameters updated by an optimizer.
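A small sketch of the difference (TF 1.x API; the names are illustrative):

```python
import tensorflow as tf

x = tf.placeholder(tf.float32)        # value supplied at run time via feed_dict
counter = tf.Variable(0.0)            # state stored in the graph
update = tf.assign_add(counter, x)    # an op that modifies the Variable

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    sess.run(update, feed_dict={x: 2.0})
    sess.run(update, feed_dict={x: 3.0})
    print(sess.run(counter))  # 5.0 -- the Variable kept its state between runs
```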
What are the tf types of model parameters?; Usually tf.Variable, e.g. tf.Variable(tf.zeros([784])).

Do you have to initialise variables before using them in a session?; Yes: sess.run(tf.global_variables_initializer())
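As a sketch (TF 1.x): running anything that reads a Variable before initialisation fails, so the initializer is typically the first thing run in a session.

```python
import tensorflow as tf

W = tf.Variable(tf.zeros([784, 10]))

with tf.Session() as sess:
    # sess.run(W) here would raise FailedPreconditionError
    # ("Attempting to use uninitialized value ...").
    sess.run(tf.global_variables_initializer())
    print(sess.run(tf.shape(W)))  # [784  10]
```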
Matrix multiplication in tf; y = tf.matmul(x, W) + b

Categorical cross-entropy loss; loss = tf.nn.softmax_cross_entropy_with_logits(labels=y_true, logits=yhat). Then you use e.g. step = tf.train.GradientDescentOptimizer(0.5).minimize(loss), where 0.5 is the learning rate. (softmax_cross_entropy_with_logits returns a per-example loss, so it is usually wrapped in tf.reduce_mean.)

Gradient descent step in tf; step = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss), then step.run(feed_dict={x: blah, y: blah}). (step.run needs a default session, e.g. tf.InteractiveSession(); otherwise use sess.run(step, feed_dict=...).)
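Putting the three cards above together, a minimal logistic-regression training step (a sketch in the TF 1.x style; it assumes MNIST-like input with 784 features and 10 classes, and the batch data is fabricated for illustration):

```python
import numpy as np
import tensorflow as tf

x = tf.placeholder(tf.float32, [None, 784])
y_true = tf.placeholder(tf.float32, [None, 10])

W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
yhat = tf.matmul(x, W) + b  # logits

loss = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(labels=y_true, logits=yhat))
step = tf.train.GradientDescentOptimizer(0.5).minimize(loss)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    batch_x = np.random.rand(32, 784).astype(np.float32)  # fake image batch
    batch_y = np.eye(10)[np.random.randint(10, size=32)]  # fake one-hot labels
    sess.run(step, feed_dict={x: batch_x, y_true: batch_y})
```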
Take an average in tf; tf.reduce_mean(thing_to_average)

Can you replace a variable in your computation graph with other input using feed_dict?; Yes, you can replace any tensor in your graph using feed_dict, not just placeholders.
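For instance (a sketch; here the intermediate tensor h is overridden even though it is not a placeholder):

```python
import tensorflow as tf

a = tf.constant(2.0)
h = a * 3.0          # an intermediate tensor, not a placeholder
out = h + 1.0

with tf.Session() as sess:
    print(sess.run(out))                       # 7.0, h computed from the graph
    print(sess.run(out, feed_dict={h: 10.0}))  # 11.0, h's computation is bypassed
```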
Cast booleans to floats; tf.cast(list_of_booleans, tf.float32)
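A common use of the two cards above is computing classification accuracy. A sketch, assuming yhat and y_true are defined as in the training example earlier (those names are carried over, not part of this card):

```python
# tf.equal gives booleans; cast them to floats and average to get accuracy.
correct = tf.equal(tf.argmax(yhat, 1), tf.argmax(y_true, 1))
accuracy = tf.reduce_mean(tf.cast(correct, tf.float32))
# then e.g. sess.run(accuracy, feed_dict={x: test_x, y_true: test_y})
```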
Why should you initialise weights with a small amount of noise?; (1) symmetry breaking: if all weights start identical, every neuron in a layer computes the same output and receives the same gradient, so they never learn different features; and (2) to prevent 0 gradients.

When using ReLU neurons, how should you initialise them?; With a slightly positive bias (e.g. 0.1) to avoid 'dead neurons'.

How might you initialise weights with a small amount of noise?; tf.Variable(tf.truncated_normal(shape, stddev=0.1))
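Helper functions in the style of the TF MNIST tutorial, consistent with the two cards above (a sketch; the function names are just a convention):

```python
import tensorflow as tf

def weight_variable(shape):
    # Small Gaussian noise for symmetry breaking.
    return tf.Variable(tf.truncated_normal(shape, stddev=0.1))

def bias_variable(shape):
    # Slightly positive bias to avoid dead ReLU neurons.
    return tf.Variable(tf.constant(0.1, shape=shape))
```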
2D convolution layer in tf; tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')

I want a conv layer to compute 32 features for each 5x5 patch, input having 1 channel (e.g. greyscale). What should the shape of the weights be?; [5, 5, 1, 32], i.e. [patch_height, patch_width, num_input_channels, num_output_channels].
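Putting the two conv cards together (a sketch; weight_variable and bias_variable are the hypothetical helpers sketched above, and x is assumed to be a [None, 784] placeholder of flattened 28x28 images):

```python
# Reshape flat 784-pixel images to [batch, height, width, channels].
x_image = tf.reshape(x, [-1, 28, 28, 1])

W_conv1 = weight_variable([5, 5, 1, 32])  # 5x5 patches, 1 input channel, 32 features
b_conv1 = bias_variable([32])
h_conv1 = tf.nn.relu(
    tf.nn.conv2d(x_image, W_conv1, strides=[1, 1, 1, 1], padding='SAME') + b_conv1)
# h_conv1 has shape [batch, 28, 28, 32] because of 'SAME' padding and stride 1.
```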
Specify shape in round or square brackets in tf?; Square brackets, e.g. [5, 5, 1, 32].

How can you turn dropout on during training and turn it off during testing?; Create a placeholder keep_prob = tf.placeholder(tf.float32) and dropout = tf.nn.dropout(prev_layer, keep_prob), then feed the corresponding value via feed_dict: e.g. keep_prob=0.5 when training and keep_prob=1.0 when testing.

Scaling used in tf.nn.dropout; Kept outputs are scaled up by 1/keep_prob, so the expected sum is unchanged.
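A sketch of the train/test switch (TF 1.x; prev_layer stands for any hypothetical earlier layer, and step/accuracy are as in the earlier sketches):

```python
keep_prob = tf.placeholder(tf.float32)
dropped = tf.nn.dropout(prev_layer, keep_prob)

# ... build the rest of the graph on top of `dropped` ...
# Training step: dropout on.
#   sess.run(step, feed_dict={x: batch_x, y_true: batch_y, keep_prob: 0.5})
# Evaluation: dropout off (keep everything).
#   sess.run(accuracy, feed_dict={x: test_x, y_true: test_y, keep_prob: 1.0})
```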