NumPy Interview Questions: Arrays, Broadcasting & More

May 29, 2026

Quick Answer

Prepare for NumPy interviews with original questions and answers on ndarray basics, dtype, shape, strides, broadcasting, views vs copies, indexing, vectorization, random numbers, image arrays, structured arrays, file I/O, and practical coding tasks.

NumPy Interview Questions: Arrays, Broadcasting, Views, Copies, Random, and Practical Coding

NumPy interviews usually test more than function names.

They check whether you understand how arrays are stored, why vectorization is faster than Python loops, when slicing creates a view, when indexing creates a copy, how broadcasting works, and how to solve small data problems without writing unnecessary loops.

This guide is written as a practical interview-preparation file. Each answer is original and uses simple examples from analytics, student scores, product data, images, and machine learning workflows.

You will prepare for questions about:

  • ndarray, shape, dtype, ndim, size, itemsize, and strides
  • arrays vs Python lists
  • vectorization and ufuncs
  • views, copies, shallow-looking array behavior, and .base
  • basic indexing, boolean indexing, and fancy indexing
  • broadcasting rules and np.newaxis
  • axis-based aggregation
  • sorting, ranking, filtering, clipping, and set operations
  • random number generation with default_rng
  • allclose and floating-point comparison
  • meshgrid, swapaxes, tile, repeat, and count_nonzero
  • image-like arrays
  • structured arrays
  • saving and loading NumPy data
  • code-output and debugging interview tasks

How To Use This Guide

Read each question and try to answer it before reading the explanation.

For code-output questions, write the output on paper first. In interviews, the goal is not only to know NumPy functions. The goal is to reason from:

text
shape + dtype + axis + indexing rule

If you can explain those four things clearly, most NumPy interview questions become manageable.

1. What Is NumPy?

NumPy is a Python library for numerical computing.

Its main object is the ndarray, which stores values in a compact, typed, multidimensional array.

Interview answer:

NumPy is used for fast numerical operations on arrays. It provides vectorized operations, broadcasting, efficient memory storage, mathematical functions, random number generation, and linear algebra tools. Many libraries such as Pandas, scikit-learn, TensorFlow, PyTorch, and image-processing tools use NumPy-style arrays internally or at their boundaries.

Example:

python
import numpy as np

prices = np.array([100, 150, 200])
discounted = prices * 0.9

print(discounted)

Output:

text
[ 90. 135. 180.]

2. What Is An ndarray?

An ndarray is NumPy's n-dimensional array object.

It is usually:

  • multidimensional
  • homogeneous, meaning values normally share one dtype
  • fixed-size after creation
  • optimized for numerical operations

Example:

python
arr = np.array([[1, 2, 3], [4, 5, 6]])

print(type(arr))
print(arr.shape)
print(arr.dtype)

Output:

text
<class 'numpy.ndarray'>
(2, 3)
int64

The default integer dtype depends on the platform: it is int64 on most 64-bit systems, but int32 on Windows with NumPy versions before 2.0.

3. How Is A NumPy Array Different From A Python List?

A Python list stores references to Python objects. It can hold mixed types and can grow dynamically.

A NumPy array stores fixed-size values of a common dtype in a compact memory layout.

Interview answer:

Lists are general-purpose containers. NumPy arrays are specialized numerical containers. Arrays support vectorized operations, use memory more compactly for numerical data, and work naturally with multidimensional shapes.

Example:

python
print([1, 2, 3] * 2)
print(np.array([1, 2, 3]) * 2)

Output:

text
[1, 2, 3, 1, 2, 3]
[2 4 6]

4. What Do shape, ndim, size, dtype, And itemsize Mean?

Use this array:

python
data = np.array([
    [10, 20, 30],
    [40, 50, 60],
], dtype=np.int32)

Attributes:

python
print(data.shape)
print(data.ndim)
print(data.size)
print(data.dtype)
print(data.itemsize)

Output:

text
(2, 3)
2
6
int32
4

Meaning:

  • shape: size along each dimension
  • ndim: number of dimensions
  • size: total number of elements
  • dtype: data type of each element
  • itemsize: bytes used by one element

5. What Are Strides?

Strides tell NumPy how many bytes to move in memory to reach the next element along each axis.

Example:

python
arr = np.arange(12, dtype=np.int32).reshape(3, 4)

print(arr)
print(arr.strides)

Possible output:

text
[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]
(16, 4)

Why?

  • Each int32 value uses 4 bytes.
  • Moving one column moves 4 bytes.
  • Moving one row moves 4 columns x 4 bytes = 16 bytes.

Interview answer:

Strides describe how an n-dimensional index maps to the underlying memory buffer. They are one reason NumPy can create views, transpose arrays, and slice arrays without always copying data.
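
As a small sketch of that last point (the variable names `transposed` and `every_other` are illustrative, not from the original example), transposing or stepping through an array only changes the strides; the underlying buffer stays the same:

```python
import numpy as np

arr = np.arange(12, dtype=np.int32).reshape(3, 4)

# Transposing swaps the strides; no element is moved in memory.
transposed = arr.T
print(arr.strides)         # (16, 4)
print(transposed.strides)  # (4, 16)
print(np.shares_memory(arr, transposed))  # True

# Taking every second column doubles the column stride.
every_other = arr[:, ::2]
print(every_other.strides)  # (16, 8)
```

Because only the metadata changes, these operations take roughly the same time regardless of array size.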

6. What Is A View?

A view is a new array object that looks at the same underlying data as another array.

Example:

python
arr = np.array([10, 20, 30, 40])
view = arr[1:3]

view[0] = 999

print(arr)
print(view)

Output:

text
[ 10 999  30  40]
[999  30]

Changing the view changed the original.

Interview answer:

A view shares the original array's data buffer. It can be faster and memory-efficient, but changes through one array may appear in the other.

7. What Is A Copy?

A copy owns separate data.

Example:

python
arr = np.array([10, 20, 30, 40])
copy_arr = arr[1:3].copy()

copy_arr[0] = 999

print(arr)
print(copy_arr)

Output:

text
[10 20 30 40]
[999  30]

The original did not change.

8. How Can You Check Whether An Array Is A View?

Use .base as a learning/debugging clue.

python
arr = np.arange(6)
view = arr[1:4]
copy_arr = arr[[1, 2, 3]]

print(view.base is arr)
print(copy_arr.base is None)

Output:

text
True
True

Important interview point:

  • basic slicing usually creates views
  • advanced indexing usually creates copies
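
A complementary check, sketched here as one possible approach, is `np.shares_memory()`, which asks directly whether two arrays overlap in memory instead of inspecting `.base`. The name `nested` is only an illustrative addition:

```python
import numpy as np

arr = np.arange(6)
view = arr[1:4]            # basic slice
copy_arr = arr[[1, 2, 3]]  # advanced (fancy) indexing
nested = view[1:]          # a slice of a slice

print(np.shares_memory(arr, view))      # True
print(np.shares_memory(arr, copy_arr))  # False
print(np.shares_memory(arr, nested))    # True
```

This tends to be easier to reason about for chains of views, where `.base` may not be the array you expect.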

9. What Is The Difference Between Basic Indexing And Advanced Indexing?

Basic indexing uses integers, slices, ellipsis, and None or np.newaxis.

Advanced indexing uses integer arrays or boolean arrays.

Example:

python
arr = np.arange(9).reshape(3, 3)

basic = arr[1:, :]
advanced = arr[[1, 2], :]

print(basic)
print(advanced)

Both may look similar, but their memory behavior differs.

Interview answer:

Basic slicing generally returns a view. Advanced indexing generally returns a copy. This matters for memory usage and whether changes affect the original array.
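
One subtlety worth showing in an interview: the copy rule applies when advanced indexing is used to read values. When the same index appears on the left-hand side of an assignment, NumPy writes into the original array. A minimal sketch (the name `picked` is illustrative):

```python
import numpy as np

arr = np.arange(9).reshape(3, 3)

# Reading with advanced indexing gives a copy.
picked = arr[[0, 2], :]
picked[:] = -1
print(arr[0])  # [0 1 2]  (original untouched)

# Assigning through advanced indexing modifies the original in place.
arr[[0, 2], :] = -1
print(arr)
```

After the assignment, rows 0 and 2 of `arr` hold -1 while row 1 is unchanged.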

10. What Is Boolean Indexing?

Boolean indexing selects values where a condition is true.

python
scores = np.array([45, 72, 88, 39, 91])

passed = scores[scores >= 50]

print(passed)

Output:

text
[72 88 91]

The condition creates a boolean mask:

python
print(scores >= 50)

Output:

text
[False  True  True False  True]
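
Masks can also be combined. The sketch below assumes the same `scores` array and uses the element-wise operators `&`, `|`, and `~`; the parentheses are required because these operators bind more tightly than comparisons:

```python
import numpy as np

scores = np.array([45, 72, 88, 39, 91])

mid_band = scores[(scores >= 50) & (scores < 90)]  # AND
extremes = scores[(scores < 40) | (scores > 90)]   # OR
not_passed = scores[~(scores >= 50)]               # NOT

print(mid_band)    # [72 88]
print(extremes)    # [39 91]
print(not_passed)  # [45 39]
```

Using Python's `and` / `or` here would raise an error, because a multi-element array has no single truth value.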

11. What Is Fancy Indexing?

Fancy indexing selects values using arrays or lists of indexes.

python
scores = np.array([45, 72, 88, 39, 91])

selected = scores[[0, 2, 4]]

print(selected)

Output:

text
[45 88 91]

Fancy indexing is useful for selecting specific rows, columns, or records.

12. What Is Broadcasting?

Broadcasting is NumPy's rule for operating on arrays with different shapes.

Example:

python
matrix = np.array([
    [10, 20, 30],
    [40, 50, 60],
])

bonus = np.array([1, 2, 3])

print(matrix + bonus)

Output:

text
[[11 22 33]
 [41 52 63]]

The 1D bonus array is applied across each row.

Interview answer:

Broadcasting lets NumPy perform element-wise operations on compatible shapes without physically copying the smaller array across the larger one.

13. What Are The Broadcasting Rules?

Compare shapes from right to left.

Two dimensions are compatible if:

  • they are equal, or
  • one of them is 1

Example:

text
(4, 3)
(   3)

Compatible because the last dimension is 3.

Example:

text
(4, 3)
(4,)

Not compatible because the trailing dimensions are 3 and 4.

Code:

python
a = np.zeros((4, 3))
b = np.array([1, 2, 3])

print((a + b).shape)

Output:

text
(4, 3)
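
To check compatibility without allocating any arrays, one option is `np.broadcast_shapes()`. This sketch assumes NumPy 1.20 or newer, where that function is available:

```python
import numpy as np

print(np.broadcast_shapes((4, 3), (3,)))    # (4, 3)
print(np.broadcast_shapes((3, 1), (1, 4)))  # (3, 4)

# Incompatible shapes raise ValueError instead of returning a shape.
try:
    np.broadcast_shapes((4, 3), (4,))
    compatible = True
except ValueError as exc:
    compatible = False
    print("incompatible:", exc)
```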

14. How Does np.newaxis Help Broadcasting?

np.newaxis adds a dimension of size 1.

python
row = np.array([1, 2, 3])
column = row[:, np.newaxis]

print(row.shape)
print(column.shape)

Output:

text
(3,)
(3, 1)

Create an outer addition table:

python
a = np.array([10, 20, 30])
b = np.array([1, 2, 3, 4])

result = a[:, np.newaxis] + b

print(result)

Output:

text
[[11 12 13 14]
 [21 22 23 24]
 [31 32 33 34]]

15. What Is Vectorization?

Vectorization means applying operations to entire arrays instead of writing Python loops over individual elements.

Loop version:

python
values = [10, 20, 30]
result = []

for value in values:
    result.append(value * 2)

NumPy version:

python
values = np.array([10, 20, 30])
result = values * 2

Interview answer:

Vectorization is faster because NumPy performs the loop in optimized compiled code and avoids much of the overhead of Python-level iteration.
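
A rough way to see the difference is to time both versions. This is only a sketch: the array size and repeat count are arbitrary choices, and the exact timings depend on the machine, so no numbers are shown here:

```python
import timeit

import numpy as np

values_list = list(range(100_000))
values_arr = np.arange(100_000)


def double_with_loop():
    # Python-level iteration over every element.
    return [value * 2 for value in values_list]


def double_with_numpy():
    # One vectorized operation over the whole array.
    return values_arr * 2


# Both approaches produce the same values.
assert double_with_loop() == double_with_numpy().tolist()

loop_time = timeit.timeit(double_with_loop, number=20)
numpy_time = timeit.timeit(double_with_numpy, number=20)

print(f"loop:  {loop_time:.4f} s")
print(f"numpy: {numpy_time:.4f} s")
```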

16. What Are Ufuncs?

Ufunc means universal function.

Ufuncs apply element-wise operations efficiently.

Examples:

python
arr = np.array([1, 4, 9, 16])

print(np.sqrt(arr))
print(np.add(arr, 10))

Output:

text
[1. 2. 3. 4.]
[11 14 19 26]

Common ufuncs include:

  • np.add
  • np.subtract
  • np.multiply
  • np.divide
  • np.sqrt
  • np.exp
  • np.log
  • np.sin

17. What Is The Difference Between axis=0 And axis=1?

For a 2D array:

  • axis=0 works down rows and returns one value per column
  • axis=1 works across columns and returns one value per row

Example:

python
marks = np.array([
    [70, 80, 90],
    [60, 75, 85],
])

print(marks.sum(axis=0))
print(marks.sum(axis=1))

Output:

text
[130 155 175]
[240 220]

Interview shortcut:

The axis you pass is the axis that gets reduced.

18. Why Use keepdims=True?

keepdims=True keeps reduced axes as dimensions of size 1.

This is useful for broadcasting.

python
marks = np.array([
    [70, 80, 90],
    [60, 75, 85],
])

row_mean = marks.mean(axis=1, keepdims=True)
centered = marks - row_mean

print(row_mean.shape)
print(centered)

Output:

text
(2, 1)
[[-10.           0.          10.        ]
 [-13.33333333   1.66666667  11.66666667]]

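Without keepdims=True, row_mean would have shape (2,), which cannot broadcast against the (2, 3) array, so the subtraction raises a ValueError. With a square array the subtraction would run, but it would silently subtract along the wrong axis.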
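
Both failure modes can be reproduced in a few lines. The `square` array below is an illustrative addition, not part of the original example:

```python
import numpy as np

marks = np.array([
    [70, 80, 90],
    [60, 75, 85],
])

# Case 1: shapes (2, 3) and (2,) are incompatible, so NumPy raises.
flat_mean = marks.mean(axis=1)
try:
    marks - flat_mean
    raised = False
except ValueError as exc:
    raised = True
    print("broadcast error:", exc)

# Case 2: a square array broadcasts, but along the wrong axis.
square = np.array([
    [1, 2],
    [3, 4],
])
row_mean = square.mean(axis=1)              # [1.5 3.5], shape (2,)
wrong = square - row_mean                   # row means applied per column
right = square - row_mean[:, np.newaxis]    # row means applied per row

print(wrong)
print(right)
```

Here `wrong` is `[[-0.5, -1.5], [1.5, 0.5]]`, while the intended row-centered result `right` is `[[-0.5, 0.5], [-0.5, 0.5]]`.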

19. What Is reshape()?

reshape() changes the shape of an array without changing the number of elements.

python
arr = np.arange(12)
matrix = arr.reshape(3, 4)

print(matrix)

Output:

text
[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]

This fails:

python
np.arange(12).reshape(5, 3)

because 12 values cannot fill 15 positions.

20. Does reshape() Return A View Or A Copy?

Often, reshape() can return a view, but it depends on memory layout.

Interview answer:

reshape() returns a view when the requested shape can be expressed with new strides over the same buffer. Otherwise it returns a copy. Assigning directly to arr.shape never copies, so it raises an error when the new shape cannot be represented as a view.

Practical advice:

python
arr = np.arange(12)
reshaped = arr.reshape(3, 4)

print(reshaped.base is arr)

Use .base only as a learning/debugging tool, not as business logic.
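
A case where reshape() has to copy is a transposed array, because its elements are no longer laid out with a regular step in the order the new shape needs. A minimal sketch using `np.shares_memory()`:

```python
import numpy as np

arr = np.arange(6).reshape(2, 3)

# Contiguous data: the reshape is just a new view of the same buffer.
same_buffer = arr.reshape(3, 2)
print(np.shares_memory(arr, same_buffer))  # True

# Transposed data: flattening needs a new buffer.
transposed = arr.T
flattened = transposed.reshape(-1)
print(np.shares_memory(arr, flattened))    # False
print(flattened)                           # [0 3 1 4 2 5]
```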

21. What Is The Difference Between ravel() And flatten()?

Both convert an array to 1D.

Important difference:

  • ravel() returns a view when possible
  • flatten() always returns a copy

Example:

python
matrix = np.arange(6).reshape(2, 3)

flat_view = matrix.ravel()
flat_copy = matrix.flatten()

flat_view[0] = 999
flat_copy[1] = 888

print(matrix)

Output:

text
[[999   1   2]
 [  3   4   5]]

flat_copy did not affect the original.

22. What Is The Difference Between transpose, .T, And swapaxes?

For 2D arrays, .T and transpose() both swap rows and columns.

python
matrix = np.array([
    [1, 2, 3],
    [4, 5, 6],
])

print(matrix.T)

Output:

text
[[1 4]
 [2 5]
 [3 6]]

For higher dimensions, swapaxes() swaps two chosen axes.

python
arr = np.zeros((2, 3, 4))

print(np.swapaxes(arr, 0, 2).shape)

Output:

text
(4, 3, 2)

Interview answer:

.T reverses axes. transpose() can reorder axes explicitly. swapaxes() swaps exactly two axes.
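
The three options can be compared side by side on one 3D array. The axis orders chosen here are arbitrary examples:

```python
import numpy as np

arr = np.zeros((2, 3, 4))

print(arr.T.shape)                   # (4, 3, 2)  all axes reversed
print(arr.transpose(1, 0, 2).shape)  # (3, 2, 4)  explicit axis order
print(np.swapaxes(arr, 1, 2).shape)  # (2, 4, 3)  two axes swapped

# All three are views over the same data.
print(np.shares_memory(arr, arr.transpose(1, 0, 2)))  # True
```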

23. What Is np.expand_dims()?

np.expand_dims() inserts a new axis.

python
arr = np.array([10, 20, 30])

row = np.expand_dims(arr, axis=0)
column = np.expand_dims(arr, axis=1)

print(row.shape)
print(column.shape)

Output:

text
(1, 3)
(3, 1)

It is commonly used when a model expects a batch dimension.

24. What Is np.squeeze()?

np.squeeze() removes axes of length 1.

python
arr = np.zeros((1, 3, 1, 4))

print(np.squeeze(arr).shape)

Output:

text
(3, 4)

Use it carefully. Removing a batch dimension accidentally can break model input shapes.

25. What Is The Difference Between np.concatenate, vstack, hstack, And stack?

concatenate joins arrays along an existing axis.

python
a = np.array([[1, 2]])
b = np.array([[3, 4]])

print(np.concatenate((a, b), axis=0))

Output:

text
[[1 2]
 [3 4]]

vstack stacks vertically.

hstack stacks horizontally.

stack creates a new axis.

python
x = np.array([1, 2])
y = np.array([3, 4])

print(np.stack((x, y), axis=0))
print(np.stack((x, y), axis=1))

Output:

text
[[1 2]
 [3 4]]
[[1 3]
 [2 4]]

Interview answer:

Use concatenate when joining along an existing dimension. Use stack when creating a new dimension.
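
For completeness, here is a short sketch of vstack and hstack on the same 1D inputs used above, since their results differ in shape:

```python
import numpy as np

x = np.array([1, 2])
y = np.array([3, 4])

vertical = np.vstack((x, y))    # 1D inputs become rows
horizontal = np.hstack((x, y))  # 1D inputs are joined end to end

print(vertical.shape)    # (2, 2)
print(horizontal.shape)  # (4,)
print(horizontal)        # [1 2 3 4]
```

For 1D inputs, hstack gives the same result as np.concatenate((x, y)).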

26. What Is The Difference Between np.tile() And np.repeat()?

tile() repeats the whole array pattern.

python
arr = np.array([1, 2, 3])

print(np.tile(arr, 2))

Output:

text
[1 2 3 1 2 3]

repeat() repeats individual elements.

python
print(np.repeat(arr, 2))

Output:

text
[1 1 2 2 3 3]

For 2D arrays, axis controls the direction for repeat.

python
matrix = np.array([[1, 2], [3, 4]])

print(np.repeat(matrix, 2, axis=0))

Output:

text
[[1 2]
 [1 2]
 [3 4]
 [3 4]]

27. What Is np.where()?

np.where() has two common uses.

Find positions:

python
scores = np.array([45, 80, 62, 30])

print(np.where(scores >= 60))

Output:

text
(array([1, 2]),)

Choose values conditionally:

python
labels = np.where(scores >= 60, "pass", "retry")

print(labels)

Output:

text
['retry' 'pass' 'pass' 'retry']

28. What Is np.clip()?

np.clip() limits values to a minimum and maximum.

python
values = np.array([-5, 10, 50, 120])

print(np.clip(values, 0, 100))

Output:

text
[  0  10  50 100]

Use it for outlier control, image pixel limits, probability bounds, and safe feature ranges.

29. What Is np.count_nonzero()?

It counts non-zero values.

python
arr = np.array([
    [1, 0, 3],
    [0, 0, 6],
])

print(np.count_nonzero(arr))
print(np.count_nonzero(arr, axis=0))
print(np.count_nonzero(arr, axis=1))

Output:

text
3
[1 0 2]
[2 1]

It is often used to count true values because True behaves like 1 and False like 0.

python
scores = np.array([45, 80, 62, 30])

print(np.count_nonzero(scores >= 60))

Output:

text
2

30. What Is np.allclose() And Why Is It Important?

Floating-point values can have tiny precision differences.

Do not compare floats using exact equality when small numerical error is expected.

python
a = np.array([0.1 + 0.2])
b = np.array([0.3])

print(a == b)
print(np.allclose(a, b))

Output:

text
[False]
True

Interview answer:

np.allclose() checks whether arrays are element-wise equal within a tolerance. It is useful for testing numerical code where tiny floating-point differences are acceptable.
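
The tolerance is controlled by `rtol` (default 1e-5) and `atol` (default 1e-8), and the check is roughly `|a - b| <= atol + rtol * |b|`. The sketch below shows one consequence: very small numbers can compare as close under the default `atol` even when they differ by a factor of two.

```python
import numpy as np

# Element-wise version of the same check.
elementwise = np.isclose([1.0, 1.0001], [1.0, 1.0])
print(elementwise)  # [ True False]

# Tiny values: the default atol dominates, so these count as close.
default_check = np.allclose(1e-9, 2e-9)
print(default_check)  # True

# Setting atol=0 makes the comparison purely relative.
relative_only = np.allclose(1e-9, 2e-9, atol=0.0)
print(relative_only)  # False
```

When the data lives near zero, choosing `atol` deliberately is usually safer than relying on the default.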

31. What Is The Difference Between np.random.seed() And default_rng()?

np.random.seed() controls legacy global random state.

Modern NumPy code should prefer np.random.default_rng().

python
rng = np.random.default_rng(42)

print(rng.integers(1, 10, size=5))

Interview answer:

default_rng() creates an independent random generator object. It avoids relying on shared global state and is the recommended approach for new code.
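
A small sketch of what "independent" means in practice: two generators created with the same seed produce the same stream, and neither touches any global state. The names `rng_a` and `rng_b` are illustrative:

```python
import numpy as np

rng_a = np.random.default_rng(42)
rng_b = np.random.default_rng(42)

draw_a = rng_a.integers(1, 10, size=5)  # upper bound 10 is exclusive
draw_b = rng_b.integers(1, 10, size=5)

# Same seed, same sequence.
print(np.array_equal(draw_a, draw_b))  # True
```

Passing a generator object into functions that need randomness makes experiments reproducible without relying on a hidden global seed.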

32. How Do You Generate Random Integers, Uniform Values, And Normal Values?

python
rng = np.random.default_rng(7)

integers = rng.integers(1, 101, size=(2, 3))
uniform_values = rng.uniform(0, 1, size=5)
normal_values = rng.normal(loc=0, scale=1, size=5)

print(integers)
print(uniform_values)
print(normal_values)

Use:

  • integers for random integer ranges
  • uniform for continuous values in a range
  • normal for Gaussian-like data

33. What Is The Difference Between shuffle() And choice()?

shuffle() rearranges an array in place.

python
rng = np.random.default_rng(10)
arr = np.array([1, 2, 3, 4, 5])

rng.shuffle(arr)

print(arr)

choice() samples values.

python
rng = np.random.default_rng(10)
arr = np.array([1, 2, 3, 4, 5])

print(rng.choice(arr, size=3, replace=False))

Use replace=False when the same item should not be selected twice.
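
If the original order must be preserved, `Generator.permutation()` is a related option: it returns a shuffled copy instead of shuffling in place. A minimal sketch:

```python
import numpy as np

rng = np.random.default_rng(10)
arr = np.array([1, 2, 3, 4, 5])

shuffled_copy = rng.permutation(arr)

print(arr)  # [1 2 3 4 5]  (original order kept)
print(sorted(shuffled_copy.tolist()))  # [1, 2, 3, 4, 5]
```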

34. What Is np.meshgrid()?

meshgrid() creates coordinate grids from coordinate vectors.

python
x = np.array([0, 1, 2])
y = np.array([10, 20])

xx, yy = np.meshgrid(x, y)

print(xx)
print(yy)

Output:

text
[[0 1 2]
 [0 1 2]]
[[10 10 10]
 [20 20 20]]

Interview answer:

meshgrid() is useful for evaluating a function on a 2D grid, plotting surfaces, creating coordinate maps, or generating image-style coordinate arrays.
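
Two follow-up points often come up: the `indexing` argument changes the output shape, and simple grid computations can be done with broadcasting alone. A sketch using the same `x` and `y`:

```python
import numpy as np

x = np.array([0, 1, 2])
y = np.array([10, 20])

# Default indexing="xy": shape is (len(y), len(x)).
xx, yy = np.meshgrid(x, y)
print(xx.shape)  # (2, 3)

# indexing="ij": shape is (len(x), len(y)).
ii, jj = np.meshgrid(x, y, indexing="ij")
print(ii.shape)  # (3, 2)

# Broadcasting gives the same grid sum without building xx and yy.
z_broadcast = x[np.newaxis, :] + y[:, np.newaxis]
print(np.array_equal(z_broadcast, xx + yy))  # True
```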

35. What Are Structured Arrays?

Structured arrays let each element contain named fields.

python
students = np.array(
    [
        ("Asha", 92, 8.7, True),
        ("Ravi", 78, 7.9, False),
    ],
    dtype=[
        ("name", "U20"),
        ("score", "i4"),
        ("cgpa", "f4"),
        ("placed", "?"),
    ],
)

print(students["name"])
print(students["score"])

Output:

text
['Asha' 'Ravi']
[92 78]

Interview answer:

Structured arrays are useful when each record has named fields, but for general tabular analytics Pandas is often more convenient.
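
Named fields can also drive filtering and sorting. This sketch reuses the same records; the cutoff of 80 is an arbitrary example value:

```python
import numpy as np

students = np.array(
    [
        ("Asha", 92, 8.7, True),
        ("Ravi", 78, 7.9, False),
    ],
    dtype=[
        ("name", "U20"),
        ("score", "i4"),
        ("cgpa", "f4"),
        ("placed", "?"),
    ],
)

# Boolean mask built from one field, applied to whole records.
high = students[students["score"] > 80]
print(high["name"])  # ['Asha']

# Sort records by a named field (ascending).
by_score = np.sort(students, order="score")
print(by_score["name"])  # ['Ravi' 'Asha']
```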

36. How Are Images Represented As NumPy Arrays?

A grayscale image can be a 2D array:

text
(height, width)

A color image is often a 3D array:

text
(height, width, channels)

For RGB images, channels are usually 3.

Common operations:

python
image = np.zeros((100, 200, 3), dtype=np.uint8)

print(image.shape)
print(image.dtype)

Output:

text
(100, 200, 3)
uint8

Examples:

python
flipped_vertical = np.flip(image, axis=0)
flipped_horizontal = np.flip(image, axis=1)
darkened = np.clip(image * 0.7, 0, 255).astype(np.uint8)
negative = 255 - image
cropped = image[20:80, 50:150]
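
Channel selection and a simple grayscale conversion follow the same indexing and axis rules. This is a sketch that assumes RGB channel order and uses an unweighted channel average, which is a simplification of how image libraries usually compute grayscale:

```python
import numpy as np

image = np.zeros((100, 200, 3), dtype=np.uint8)
image[:, :, 0] = 255  # fill the first channel (red, assuming RGB order)

red = image[:, :, 0]       # one channel, a 2D view
gray = image.mean(axis=2)  # average over channels, returns float64

print(red.shape)               # (100, 200)
print(gray.shape, gray.dtype)  # (100, 200) float64
print(gray[0, 0])              # 85.0

gray_u8 = gray.astype(np.uint8)  # back to an image-friendly dtype
```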

37. What Is The Difference Between np.save, np.load, And np.savetxt?

np.save() stores one array in NumPy's binary .npy format.

python
arr = np.array([1, 2, 3])

np.save("numbers.npy", arr)
loaded = np.load("numbers.npy")

print(loaded)

np.savetxt() stores text data such as CSV-like output.

Binary .npy is usually better for preserving dtype and shape.

Use np.savez() or np.savez_compressed() for multiple arrays.
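
A sketch of both options, using a temporary directory so nothing is left on disk. The file names and the array names `features` and `labels` are illustrative:

```python
import os
import tempfile

import numpy as np

features = np.array([[1.5, 2.5], [3.5, 4.5]])
labels = np.array([0, 1])

with tempfile.TemporaryDirectory() as tmp_dir:
    # Several arrays in one .npz archive, retrieved by keyword name.
    npz_path = os.path.join(tmp_dir, "dataset.npz")
    np.savez(npz_path, features=features, labels=labels)
    with np.load(npz_path) as archive:
        loaded_features = archive["features"]
        loaded_labels = archive["labels"]

    # Text round trip: readable, but dtype information is not stored.
    csv_path = os.path.join(tmp_dir, "features.csv")
    np.savetxt(csv_path, features, delimiter=",")
    from_text = np.loadtxt(csv_path, delimiter=",")

print(loaded_features.shape, loaded_features.dtype)  # (2, 2) float64
print(np.array_equal(from_text, features))           # True
```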

38. Code Output: Slicing View

Question:

python
arr = np.array([10, 20, 30, 40])
view = arr[1:3]
view[1] = 999

print(arr)

Answer:

text
[ 10  20 999  40]

Explanation:

view shares data with arr. view[1] corresponds to arr[2].

39. Code Output: Fancy Indexing Copy

Question:

python
arr = np.array([10, 20, 30, 40])
selected = arr[[1, 2]]
selected[0] = 999

print(arr)
print(selected)

Answer:

text
[10 20 30 40]
[999  30]

Fancy indexing returned a copy.

40. Code Output: Broadcasting

Question:

python
a = np.array([[1], [2], [3]])
b = np.array([10, 20, 30, 40])

print((a + b).shape)
print(a + b)

Answer:

text
(3, 4)
[[11 21 31 41]
 [12 22 32 42]
 [13 23 33 43]]

Shapes:

text
(3, 1)
(4,)

Broadcast to:

text
(3, 4)

41. Code Output: Axis Reduction

Question:

python
arr = np.array([
    [1, 2, 3],
    [4, 5, 6],
])

print(arr.sum(axis=0))
print(arr.sum(axis=1))

Answer:

text
[5 7 9]
[ 6 15]

42. Code Output: tile vs repeat

Question:

python
arr = np.array([1, 2, 3])

print(np.tile(arr, 2))
print(np.repeat(arr, 2))

Answer:

text
[1 2 3 1 2 3]
[1 1 2 2 3 3]

43. Code Output: allclose

Question:

python
a = np.array([0.1 + 0.2])
b = np.array([0.3])

print(a == b)
print(np.allclose(a, b))

Answer:

text
[False]
True

The exact binary representation of decimal fractions can produce tiny differences.

44. Debugging: Why Does This Broadcasting Fail?

Question:

python
sales = np.zeros((4, 3))
bonus = np.array([1, 2, 3, 4])

sales + bonus

Answer:

This fails because shapes are:

text
(4, 3)
(4,)

Broadcasting compares from the right:

text
3 vs 4

They are not equal, and neither is 1.

Fix by making bonus a column:

python
bonus = bonus.reshape(4, 1)
print((sales + bonus).shape)

Output:

text
(4, 3)

45. Debugging: Why Did My Original Array Change?

Question:

python
data = np.arange(10)
part = data[2:5]
part[:] = -1

print(data)

Answer:

part is a view created by slicing, so it shares memory with data.

Output:

text
[ 0  1 -1 -1 -1  5  6  7  8  9]

Fix:

python
part = data[2:5].copy()

46. Debugging: Why Is arr == np.nan Always False?

NaN is not equal to itself.

python
arr = np.array([1.0, np.nan, 3.0])

print(arr == np.nan)

Output:

text
[False False False]

Correct:

python
print(np.isnan(arr))

Output:

text
[False  True False]
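
The same issue affects aggregations: a single NaN makes the ordinary mean or sum NaN. A short sketch of the NaN-aware alternatives:

```python
import numpy as np

arr = np.array([1.0, np.nan, 3.0])

print(np.mean(arr))         # nan
print(np.nanmean(arr))      # 2.0
print(np.nansum(arr))       # 4.0
print(np.isnan(arr).sum())  # 1  (number of missing values)
```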

47. Debugging: Why Did Integer Division Become Float?

python
arr = np.array([1, 2, 3])

print((arr / 2).dtype)
print(arr // 2)

Output:

text
float64
[0 1 1]

/ performs true division and returns a floating-point result for integer inputs, even when every value divides evenly. // performs floor division and keeps the integer dtype.

48. Coding Task: Normalize Each Row

Question:

Normalize each row using:

text
(row - row_min) / (row_max - row_min)

Solution:

python
data = np.array([
    [10, 20, 30],
    [2, 4, 8],
    [100, 150, 200],
])

row_min = data.min(axis=1, keepdims=True)
row_max = data.max(axis=1, keepdims=True)

normalized = (data - row_min) / (row_max - row_min)

print(normalized)

Output:

text
[[0.         0.5        1.        ]
 [0.         0.33333333 1.        ]
 [0.         0.5        1.        ]]
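
An interviewer may ask what happens when a row is constant, since row_max - row_min is then zero. One possible guard, sketched here with illustrative names `span` and `safe_span`, replaces a zero range with 1 so that constant rows normalize to 0:

```python
import numpy as np

data = np.array([
    [5, 5, 5],   # constant row: max - min is 0
    [1, 2, 3],
])

row_min = data.min(axis=1, keepdims=True)
row_max = data.max(axis=1, keepdims=True)

span = row_max - row_min
safe_span = np.where(span == 0, 1, span)  # avoid division by zero

normalized = (data - row_min) / safe_span

print(normalized)
```

The constant row becomes all zeros and the second row becomes 0, 0.5, 1. Whether zeros are the right value for a constant row is a design choice that depends on the task.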

49. Coding Task: Find Rows With Any Value Greater Than X

python
arr = np.array([
    [1, 2, 3],
    [10, 2, 1],
    [3, 9, 4],
])

x = 6

rows = np.where((arr > x).any(axis=1))[0]

print(rows)

Output:

text
[1 2]

50. Coding Task: Remove Minimum And Maximum Values

Remove every occurrence of the minimum and maximum values.

python
arr = np.array([4, 9, 1, 3, 9, 2, 1, 7])

minimum = arr.min()
maximum = arr.max()

result = arr[(arr != minimum) & (arr != maximum)]

print(result)

Output:

text
[4 3 2 7]

51. Coding Task: Sort Rows By Second Column

python
data = np.array([
    [101, 75],
    [102, 92],
    [103, 60],
])

sorted_rows = data[np.argsort(data[:, 1])]

print(sorted_rows)

Output:

text
[[103  60]
 [101  75]
 [102  92]]

Descending:

python
sorted_rows_desc = data[np.argsort(data[:, 1])[::-1]]

52. Coding Task: Add Total Column And Get Top 2

python
marks = np.array([
    [70, 80, 90],
    [60, 75, 85],
    [95, 91, 93],
    [50, 65, 70],
])

total = marks.sum(axis=1, keepdims=True)
with_total = np.concatenate((marks, total), axis=1)

ranked = with_total[np.argsort(with_total[:, -1])[::-1]]

print(ranked[:2])

Output:

text
[[ 95  91  93 279]
 [ 70  80  90 240]]
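
For large arrays, a full sort is more work than needed to find the top k. A sketch of an alternative with `np.argpartition()`, using the row totals from the example above:

```python
import numpy as np

total = np.array([240, 220, 279, 185])
k = 2

# Indexes of the k largest values, in no guaranteed order.
top_idx = np.argpartition(total, -k)[-k:]

# Sort only those k indexes by value, descending.
top_idx = top_idx[np.argsort(total[top_idx])[::-1]]

print(top_idx)         # [2 0]
print(total[top_idx])  # [279 240]
```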

53. Coding Task: Unique Rows

python
records = np.array([
    [1, 10],
    [2, 20],
    [1, 10],
    [3, 30],
])

print(np.unique(records, axis=0))

Output:

text
[[ 1 10]
 [ 2 20]
 [ 3 30]]

54. Coding Task: Count Category Frequencies

python
labels = np.array(["free", "pro", "free", "team", "pro", "free"])

categories, counts = np.unique(labels, return_counts=True)

print(categories)
print(counts)

Output:

text
['free' 'pro' 'team']
[3 2 1]

55. Coding Task: Build A Distance Matrix

Given points on a line:

python
points = np.array([1, 4, 9])

Create pairwise absolute distances.

python
distance = np.abs(points[:, np.newaxis] - points[np.newaxis, :])

print(distance)

Output:

text
[[0 3 8]
 [3 0 5]
 [8 5 0]]

This uses broadcasting.

56. Coding Task: Euclidean Distance From A Target Point

python
points = np.array([
    [2, 3],
    [5, 7],
    [1, 8],
])

target = np.array([3, 4])

distances = np.sqrt(((points - target) ** 2).sum(axis=1))

print(distances)

Output:

text
[1.41421356 3.60555128 4.47213595]
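
The same distances can be computed with `np.linalg.norm()`, which some interviewers expect as the shorter form. A sketch comparing both:

```python
import numpy as np

points = np.array([
    [2, 3],
    [5, 7],
    [1, 8],
])
target = np.array([3, 4])

manual = np.sqrt(((points - target) ** 2).sum(axis=1))
with_norm = np.linalg.norm(points - target, axis=1)  # L2 norm per row

print(np.allclose(manual, with_norm))  # True
```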

57. Coding Task: Create A Checkerboard Matrix

python
board = np.zeros((6, 6), dtype=int)
board[::2, ::2] = 1
board[1::2, 1::2] = 1

print(board)

Output:

text
[[1 0 1 0 1 0]
 [0 1 0 1 0 1]
 [1 0 1 0 1 0]
 [0 1 0 1 0 1]
 [1 0 1 0 1 0]
 [0 1 0 1 0 1]]

58. Coding Task: Replace Outliers With Boundary Values

python
values = np.array([5, 12, 40, 99, 120, -3])

cleaned = np.clip(values, 0, 100)

print(cleaned)

Output:

text
[  5  12  40  99 100   0]

59. Coding Task: Find Common Product IDs

python
batch_a = np.array([101, 102, 103, 104])
batch_b = np.array([103, 104, 105, 106])

print(np.intersect1d(batch_a, batch_b))
print(np.setdiff1d(batch_a, batch_b))
print(np.union1d(batch_a, batch_b))

Output:

text
[103 104]
[101 102]
[101 102 103 104 105 106]

60. Coding Task: Use meshgrid To Evaluate A Function

python
x = np.array([0, 1, 2])
y = np.array([10, 20])

xx, yy = np.meshgrid(x, y)

z = xx + yy

print(z)

Output:

text
[[10 11 12]
 [20 21 22]]

61. Interview Answer: How Would You Improve Slow NumPy Code?

Strong answer:

First, I would check whether the code is using Python loops over array elements. Then I would look for vectorization, broadcasting, ufuncs, axis-based reductions, and boolean masks. I would also avoid repeated appends inside loops because NumPy arrays are fixed-size; it is better to collect data first or preallocate the final array. Finally, I would check unnecessary copies, dtype choices, and memory layout if performance still matters.
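
The point about appends can be illustrated with a small sketch. The three versions below compute the same squares; the size `n = 5` is only for readability:

```python
import numpy as np

n = 5

# Slow pattern: np.append allocates a brand-new array on every iteration.
grown = np.array([], dtype=np.int64)
for i in range(n):
    grown = np.append(grown, i * i)

# Better: allocate once, then fill.
filled = np.empty(n, dtype=np.int64)
for i in range(n):
    filled[i] = i * i

# Best when the computation allows it: no Python loop at all.
vectorized = np.arange(n, dtype=np.int64) ** 2

print(grown)
print(filled)
print(vectorized)
```

All three print the values 0, 1, 4, 9, 16; the difference is how much allocation and Python-level work each one does as n grows.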

62. Interview Answer: When Should You Not Use NumPy?

Strong answer:

NumPy is not ideal for mixed object-heavy data, heavily nested Python objects, row-by-row business logic, or datasets too large for memory unless paired with chunking or other tools. For labeled tabular data, Pandas is often more ergonomic. For GPU tensor work, PyTorch, TensorFlow, JAX, or CuPy may be better depending on the project.

63. Interview Answer: Why Can Broadcasting Be Dangerous?

Broadcasting can silently create a result with a valid but unintended shape.

Example:

python
a = np.ones((3, 1))
b = np.ones((1, 4))

print((a + b).shape)

Output:

text
(3, 4)

The operation is valid, but if you expected a 1D result, the silently expanded (3, 4) shape is a bug that raises no error.

Good habit:

python
print(a.shape, b.shape)

before combining arrays.
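
A realistic version of this bug appears when a model returns predictions as a column while the targets are 1D. The names `y_true` and `y_pred` and their values are hypothetical:

```python
import numpy as np

y_true = np.array([1.0, 2.0, 3.0])        # shape (3,)
y_pred = np.array([[1.5], [2.0], [2.5]])  # shape (3, 1)

# Broadcasts to (3, 3): every prediction minus every target.
wrong = y_pred - y_true
print(wrong.shape)  # (3, 3)

# Flatten the predictions first to get one error per sample.
right = y_pred.ravel() - y_true
print(right.shape)  # (3,)
```

A mean of `wrong` would still return a single number, which is why this kind of mistake can go unnoticed.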

64. Interview Answer: Why Can Copies Hurt Performance?

Copies use extra memory and time.

If you slice a huge array and can work with a view safely, it can be faster and more memory-efficient.

But views can cause accidental mutation.

Strong answer:

Views are efficient but share data. Copies are safer when independence matters. The right choice depends on whether the downstream code should be allowed to affect the original data.

65. Interview Answer: Why Does dtype Matter?

dtype controls:

  • memory usage
  • numerical range
  • precision
  • operation results
  • compatibility with libraries

Example:

python
a = np.array([1, 2, 3], dtype=np.int8)
b = np.array([1, 2, 3], dtype=np.float64)

print(a.itemsize)
print(b.itemsize)

Output:

text
1
8

Using a smaller dtype can save memory, but it can also overflow if values exceed the dtype range.
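
A sketch of that overflow with int8, whose range is -128 to 127. Array arithmetic wraps around silently, so the fix shown here is to widen the dtype before adding:

```python
import numpy as np

small = np.array([100, 120, 127], dtype=np.int8)
step = np.array([1, 10, 1], dtype=np.int8)

# int8 + int8 stays int8, so results above 127 wrap around.
wrapped = small + step
print(wrapped)  # [ 101 -126 -128]

# Widening one operand first gives the intended values.
safe = small.astype(np.int16) + step
print(safe)     # [101 130 128]

print(np.iinfo(np.int8).min, np.iinfo(np.int8).max)  # -128 127
```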

66. Quick Revision Table

Topic | Interview point
ndarray | typed, multidimensional array
shape | size of each dimension
dtype | type and storage format of elements
strides | bytes to move along each axis
view | shares data
copy | owns separate data
basic slicing | usually view
advanced indexing | usually copy
broadcasting | compatible shape expansion without manual loops
axis | dimension being reduced or operated along
keepdims | keeps reduced axes for broadcasting
ravel | view when possible
flatten | copy
default_rng | recommended random generator constructor
allclose | tolerance-based float comparison
tile | repeats whole pattern
repeat | repeats individual elements
meshgrid | coordinate grids
structured array | records with named fields

67. Rapid-Fire Interview Questions

1. What is NumPy mainly used for?

Fast numerical work with arrays.

2. What is the main NumPy object?

ndarray.

3. What does shape return?

A tuple showing the size of each dimension.

4. What does dtype tell you?

The type and storage format of each array element.

5. What does axis=0 mean in a 2D aggregation?

Reduce down rows and return one result per column.

6. What does axis=1 mean in a 2D aggregation?

Reduce across columns and return one result per row.

7. Does slicing copy data?

Basic slicing usually returns a view.

8. Does fancy indexing copy data?

Usually yes.

9. Why use copy()?

To avoid changing the original array when modifying selected data.

10. Why use np.allclose()?

To compare floating-point arrays with tolerance.

11. What is broadcasting?

Automatic shape compatibility for element-wise operations.

12. What is vectorization?

Using array operations instead of Python loops.

13. Why is vectorization faster?

The loop runs in optimized compiled code with less Python overhead.

14. What is np.where()?

A conditional selection function or a way to find matching positions.

15. What is np.argmax()?

It returns the index of the maximum value.

16. What is np.argmin()?

It returns the index of the minimum value.

17. What is np.unique(..., return_counts=True) used for?

Finding unique values and their frequencies.

18. What is np.clip() used for?

Limiting values to a minimum and maximum range.

19. What is np.meshgrid() used for?

Creating coordinate grids.

20. What is a structured array?

An array with named fields inside each record.

68. Practice Interview Set

Try these without looking at the answers first.

Question 1

Explain why NumPy arrays are faster than Python lists for numerical operations.

Question 2

Given an array with shape (5, 1) and another with shape (3,), what is the result shape after addition?

Question 3

What happens when you modify an array slice?

Question 4

Write code to select rows where any value is negative.

Question 5

Write code to normalize each column.

Question 6

Write code to get the top 3 values from a 1D array.

Question 7

Write code to find duplicate values in an array.

Question 8

Write code to replace NaN values with zero.

Question 9

Write code to create a 5 by 5 identity matrix.

Question 10

Write code to save and load a NumPy array.

69. Practice Interview Answers

Solution Key

Answer 1

NumPy arrays are faster because values are stored in a compact typed buffer, and operations run in optimized compiled code instead of Python-level loops.

Answer 2

Shapes:

text
(5, 1)
(3,)

Result:

text
(5, 3)

Answer 3

Basic slices usually create views, so modifying the slice can modify the original array.

Answer 4

python
arr = np.array([
    [1, 2, 3],
    [4, -1, 6],
    [7, 8, 9],
])

rows = arr[(arr < 0).any(axis=1)]

print(rows)

Explanation

  • A 2D NumPy array arr is created with integers, including a negative value (-1).
  • The expression (arr < 0).any(axis=1) generates a boolean array indicating which rows contain at least one negative value.
  • The original array arr is indexed with this boolean array to extract the rows that meet the condition.
  • The resulting rows are stored in the variable rows and printed, showing only the rows with negative values.

Answer 5

python
data = np.array([
    [10, 100],
    [20, 150],
    [30, 200],
])

col_min = data.min(axis=0, keepdims=True)
col_max = data.max(axis=0, keepdims=True)

normalized = (data - col_min) / (col_max - col_min)

print(normalized)

Explanation

  • The code initializes a 2D NumPy array named data with specific values.
  • It calculates the minimum values for each column using data.min(axis=0, keepdims=True), preserving the array's dimensions.
  • Similarly, it computes the maximum values for each column with data.max(axis=0, keepdims=True).
  • The normalization formula (data - col_min) / (col_max - col_min) is applied to scale the data to a range between 0 and 1.
  • Finally, the normalized array is printed, showing the transformed values.

Answer 6

python
arr = np.array([12, 99, 4, 42, 18, 77])

top_3 = np.sort(arr)[-3:][::-1]

print(top_3)

Explanation

  • The code initializes a NumPy array arr containing six integer values.
  • It sorts the array in ascending order using np.sort(arr).
  • The last three elements of the sorted array, which are the highest values, are selected with [-3:].
  • The selected values are then reversed to present them in descending order using [::-1].
  • Finally, the top three values are printed to the console.
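For large arrays, a full sort does more work than the task needs. A possible alternative, sketched here with the same values, uses np.partition to isolate the k largest elements and then sorts only those. The variable k is introduced for illustration.

```python
import numpy as np

arr = np.array([12, 99, 4, 42, 18, 77])
k = 3

# After partitioning at position -k, the last k slots hold the k largest
# values, but in no guaranteed order.
largest = np.partition(arr, -k)[-k:]

# Sort just those k values and reverse for descending order.
top_k = np.sort(largest)[::-1]

print(top_k)  # [99 77 42]
```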

Answer 7

python
arr = np.array([1, 2, 2, 3, 4, 4, 4])

values, counts = np.unique(arr, return_counts=True)
duplicates = values[counts > 1]

print(duplicates)

Explanation

  • The code initializes a NumPy array arr containing integers, some of which are duplicated.
  • It uses np.unique() to find unique values in the array while also counting their occurrences, returning two arrays: values and counts.
  • The duplicates array is created by filtering values where the corresponding counts are greater than 1, indicating duplicates.
  • Finally, it prints the duplicates array, which contains the values that appear more than once in the original array.

Answer 8

python
arr = np.array([1.0, np.nan, 3.0, np.nan])

cleaned = np.where(np.isnan(arr), 0, arr)

print(cleaned)

Explanation

  • The code initializes a NumPy array arr containing floating-point numbers, including NaN values.
  • It uses np.isnan(arr) to create a boolean mask identifying the NaN elements in the array.
  • The np.where function replaces NaN values with 0 while keeping other values unchanged.
  • The resulting array cleaned is printed, showing the original values with NaN replaced by 0.

Alternative:

python
cleaned = np.nan_to_num(arr, nan=0.0)

Explanation

  • Utilizes the np.nan_to_num() function from the NumPy library to handle NaN values.
  • The input arr is a NumPy array that may contain NaN (Not a Number) entries.
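  • Any NaN values found in arr are replaced with 0.0, ensuring the output array cleaned has no NaN values.
  • By default, np.nan_to_num() also replaces positive and negative infinity with very large finite numbers; pass posinf and neginf to control those replacements.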
  • This is useful for data preprocessing, especially before performing mathematical operations or analyses that cannot handle NaN values.

Answer 9

python
identity = np.eye(5)

print(identity)

Explanation

  • The code utilizes the NumPy library, which is commonly used for numerical operations in Python.
  • np.eye(5) creates a 5x5 identity matrix, where all the diagonal elements are 1 and all other elements are 0. The default dtype is float64; pass dtype=int for an integer matrix.
  • The print(identity) statement outputs the generated identity matrix to the console.
  • Identity matrices are useful in various mathematical computations, including linear algebra and transformations.

Answer 10

python
arr = np.array([[1, 2], [3, 4]])

np.save("arr.npy", arr)
loaded = np.load("arr.npy")

print(loaded)

Explanation

  • The code creates a 2D NumPy array named arr containing the values [[1, 2], [3, 4]].
  • It uses np.save to write the array to a file called "arr.npy" in NumPy's binary .npy format, which stores the shape and dtype along with the values.
  • The array is then loaded back into memory using np.load, retrieving the saved data into the variable loaded.
  • Finally, the loaded array is printed to the console, displaying its contents.
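If the interviewer asks about saving several arrays together, np.savez is one option. The sketch below writes to a temporary directory so it leaves no files behind. The array names and the file name "bundle.npz" are hypothetical.

```python
import os
import tempfile

import numpy as np

scores = np.array([[1, 2], [3, 4]])
labels = np.array([0, 1])

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "bundle.npz")

    # Keyword names become the keys used to look the arrays up later.
    np.savez(path, scores=scores, labels=labels)

    # The loaded object behaves like a read-only dict of arrays.
    with np.load(path) as bundle:
        loaded_scores = bundle["scores"]
        loaded_labels = bundle["labels"]

print(np.array_equal(loaded_scores, scores))  # True
print(np.array_equal(loaded_labels, labels))  # True
```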

70. Common Mistakes To Avoid

Mistake 1: Not checking shapes

Most NumPy bugs are shape bugs.

Always inspect:

python
print(arr.shape)

Explanation

  • The code uses the print function to output information to the console.
  • arr.shape accesses the shape attribute of a NumPy array, which returns a tuple representing the dimensions of the array.
  • This is useful for understanding the structure of the data, such as the number of rows and columns in a 2D array.
  • The output will vary depending on the specific shape of the arr array being analyzed.
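A typical shape bug looks like the sketch below: subtracting a (3, 1) column from a (3,) vector broadcasts silently to (3, 3) instead of raising an error. The names predictions and targets are hypothetical.

```python
import numpy as np

predictions = np.array([1.0, 2.0, 3.0])    # shape (3,)
targets = np.array([[1.0], [2.0], [3.0]])  # shape (3, 1)

# No error is raised: broadcasting produces every pairwise difference.
diff = predictions - targets
print(diff.shape)   # (3, 3)

# Flattening targets first gives the intended element-wise difference.
fixed = predictions - targets.ravel()
print(fixed.shape)  # (3,)
```

Printing shapes before and after the operation exposes the problem immediately.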

Mistake 2: Confusing views and copies

If you modify a slice and the original changes, you probably had a view.

When independence matters, use:

python
arr[2:5].copy()

Explanation

  • arr[2:5] is a basic slice of the NumPy array arr covering indices 2 through 4 (index 5 is excluded). On its own, it is a view that shares memory with arr.
  • Calling .copy() on the slice allocates a new array with its own data instead of returning a view.
  • Changes to the copy do not affect arr, and later changes to arr do not affect the copy.
  • The copy contains the elements of the sliced range as they were when the copy was made.

Mistake 3: Comparing floats with exact equality

When tiny numerical differences are acceptable, use:

python
np.allclose(a, b)

Explanation

  • np.allclose() compares two arrays, a and b, element by element.
  • It returns True if every pair of elements is equal within a tolerance, and False otherwise.
  • This is useful for numerical comparisons where floating-point rounding makes exact equality unreliable.
  • The tolerance is controlled by the optional rtol and atol parameters (defaults 1e-05 and 1e-08); each pair passes when |a - b| <= atol + rtol * |b|.
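The classic 0.1 + 0.2 case shows the difference between exact and tolerant comparison. This is a minimal sketch with made-up values:

```python
import numpy as np

a = np.array([0.1 + 0.2, 1.0])
b = np.array([0.3, 1.0])

# Exact comparison fails on the first element because of rounding.
print(a == b)             # [False  True]

# Tolerant comparison treats both pairs as equal.
print(np.allclose(a, b))  # True

# np.isclose gives the element-wise version of the same check.
print(np.isclose(a, b))   # [ True  True]
```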

Mistake 4: Using loops for simple array operations

When possible, prefer vectorized expressions over manual loops:

python
arr * 2
arr[arr > 0]
arr.sum(axis=1)

Explanation

  • arr * 2 multiplies every element of arr by 2 and returns a new array.
  • arr[arr > 0] uses a boolean mask to select only the elements greater than zero, returning a new 1D array (a copy, not a view).
  • arr.sum(axis=1) reduces along axis 1, so for a 2D array it returns one sum per row.

Mistake 5: Using np.append() repeatedly in a loop

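NumPy arrays are fixed-size, so np.append() never grows an array in place. Each call allocates a new array and copies every existing element, which makes building an array one element at a time quadratic in its final length.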

Better approaches:

  • collect values in a Python list, then convert once
  • preallocate the final NumPy array
  • use concatenate once when possible
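The three approaches above can be sketched as follows. The loop body (squaring an index) and the chunk values are placeholders for whatever work produces the data.

```python
import numpy as np

n = 5

# Option 1: collect values in a Python list, then convert once.
collected = []
for i in range(n):
    collected.append(i * i)
from_list = np.array(collected)

# Option 2: preallocate the final array and fill it in place.
prealloc = np.empty(n, dtype=np.int64)
for i in range(n):
    prealloc[i] = i * i

# Option 3: gather array chunks, then concatenate a single time.
chunks = [np.array([0, 1]), np.array([4, 9]), np.array([16])]
joined = np.concatenate(chunks)

print(from_list)  # [ 0  1  4  9 16]
print(prealloc)   # [ 0  1  4  9 16]
print(joined)     # [ 0  1  4  9 16]
```

Preallocation needs the final size up front; the list approach is usually the simplest when the size is unknown.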

Final Summary

For NumPy interviews, remember these core ideas:

  • ndarray is a typed multidimensional array.
  • Shape tells you structure; dtype tells you storage and numerical behavior.
  • Strides explain how NumPy walks through memory.
  • Basic slicing returns views that share memory with the original array.
  • Advanced indexing (boolean masks or integer arrays) returns copies.
  • Broadcasting compares shapes from right to left.
  • Vectorization avoids Python-level loops.
  • axis tells NumPy which dimension to operate over or reduce.
  • keepdims=True keeps dimensions useful for broadcasting.
  • default_rng() is preferred for modern random number generation.
  • allclose() is better than exact equality for many floating-point checks.
  • Image data, structured records, and grids are all natural NumPy use cases.

The best interview answers are short, accurate, and supported by a small example.

Sources and Further Reading